Length of a string

The len() function is a built-in utility in Python that returns the length or size of an object. It's most commonly used with strings to count the number of characters, but it also works with lists, tuples, sets, dictionaries, and a few other types. Using len() effectively is a fundamental skill for any Python developer.

In this in-depth guide, we'll explore everything you need to know about len(), from the basics of its syntax and common use cases to some more advanced concepts and performance considerations. Let's dive in!

Basic Usage and Syntax

The len() function takes a single argument (the object you want to measure) and returns an integer result. The syntax couldn't be simpler:

len(obj)

Here are a few quick examples of using len() with different built-in types:

print(len("Hello World")) # Output: 11

print(len([1, 2, 3, 4, 5])) # Output: 5

print(len((1, 2, 3))) # Output: 3

print(len({1, 2, 3})) # Output: 3

print(len({"a": 1, "b": 2, "c": 3})) # Output: 3

As you can see, len() returns the number of characters for a string, the number of elements for a list, tuple, or set, and the number of key-value pairs for a dictionary. We'll explore what the length means for each type in more detail later on.
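The same call also works on several other built-in types. The short sketch below shows a few of them; the variable names are chosen only for illustration.

```python
# len() on a few other built-in types (variable names are illustrative).
range_length = len(range(0, 10, 2))       # the values 0, 2, 4, 6, 8
bytes_length = len(b"abc")                # number of bytes
bytearray_length = len(bytearray(4))      # four zero-filled bytes
frozenset_length = len(frozenset("hello"))  # unique letters: h, e, l, o

print(range_length)      # 5
print(bytes_length)      # 3
print(bytearray_length)  # 4
print(frozenset_length)  # 4
```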

Using len() with Strings

Strings are probably the most common use case for len(). When you pass a string to len(), it simply counts and returns the total number of characters in the string, including spaces, punctuation, and special characters.

message = "Welcome to Python!"
print(len(message)) # Output: 18

The length is 18 because there are 18 characters total in the string, including the spaces and exclamation point.

What about empty strings? As you might expect, len() returns 0 since there are no characters:

empty_string = ""
print(len(empty_string)) # Output: 0

It's important to understand that len() counts the number of characters (Unicode code points) in the string, not the number of bytes. This distinction matters if your string contains non-ASCII characters like emojis or Chinese characters, since those need multiple bytes to represent a single character when encoded. Note that some emojis are built from several code points, so they can have a length greater than 1.

name = "吉姆" # Chinese name
print(len(name)) # Output: 2

Encoded as UTF-8, each of these Chinese characters takes 3 bytes, but len() still reports the length as 2 because it counts characters, not bytes.
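To see the difference between characters and bytes, one option is to encode the string and measure the result. The sketch below assumes UTF-8; other encodings would give different byte counts.

```python
name = "吉姆"  # the same two-character string as above

char_count = len(name)                  # counts Unicode code points
byte_count = len(name.encode("utf-8"))  # counts bytes after UTF-8 encoding

print(char_count)  # 2
print(byte_count)  # 6, since each of these characters needs 3 bytes in UTF-8
```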

Using len() with Lists, Tuples, and Sets

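When working with Python's built-in collection types such as lists, tuples, and sets, len() returns the number of elements in the collection. Lists and tuples are sequences, while sets are unordered collections, but len() works the same way on all of them.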

numbers = [1, 2, 3, 4, 5]
print(len(numbers)) # Output: 5

coords = (10, 20, 30)
print(len(coords)) # Output: 3

fruits = {"apple", "banana", "orange"}
print(len(fruits)) # Output: 3

Again, empty sequences will have a length of 0:

empty_list = []
print(len(empty_list)) # Output: 0

One thing to watch out for is that len() only counts the top-level elements. If your list/tuple contains nested sub-sequences, those all just count as 1 element from the perspective of the outer sequence:

nested = [[1, 2, 3], [4, 5, 6]]
print(len(nested)) # Output: 2
print(len(nested[0])) # Output: 3

Here the outer list nested has a length of 2, since it contains 2 elements (the inner lists). To get the length of the first inner list, we have to use nested[0] to index into it first.
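If you do want the total number of items across the inner lists, one possible approach is to add up the individual lengths. This is a minimal sketch that assumes exactly one level of nesting; the name total_items is just illustrative.

```python
nested = [[1, 2, 3], [4, 5, 6]]

# Add up the length of each inner list to count every number.
total_items = sum(len(inner) for inner in nested)

print(len(nested))   # 2 (inner lists)
print(total_items)   # 6 (numbers across both inner lists)
```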

Using len() with Dictionaries

For dictionaries, len() returns the number of key-value pairs currently stored in the dictionary.

ages = {"Alice": 30, "Bob": 25, "Charlie": 35}
print(len(ages)) # Output: 3

It's important to understand that len() only counts the top-level keys in the dictionary. If any of your values are lists, tuples, sets, or nested dictionaries, their contents don't get counted recursively.

d = {"a": [1, 2, 3], "b": {1, 2}}
print(len(d)) # Output: 2

Even though the values for "a" and "b" contain multiple elements, len(d) still just returns 2 since there are only 2 top-level keys.
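To measure the values as well, you could call len() on each one separately. The sketch below builds a hypothetical sizes dictionary with a dict comprehension, and assumes every value supports len().

```python
d = {"a": [1, 2, 3], "b": {1, 2}}

# Map each key to the length of its value (assumes every value has a length).
sizes = {key: len(value) for key, value in d.items()}

print(len(d))  # 2 top-level keys
print(sizes)   # {'a': 3, 'b': 2}
```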

Common Mistakes and Gotchas

One of the most common mistakes with len() is passing it the wrong type of argument. len() expects an object that defines a __len__ method, which includes all the built-in sequences and collections. If you pass an integer, floating point number, or other unsupported type, you'll get a TypeError:

n = 42
print(len(n)) # TypeError: object of type 'int' has no len()
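When the type of a value is not known in advance, one way to guard against this error is to catch the TypeError. The helper below, safe_len, is a hypothetical name used for illustration and is not part of Python.

```python
def safe_len(obj, default=None):
    """Return len(obj), or `default` if obj does not support len()."""
    try:
        return len(obj)
    except TypeError:
        return default


print(safe_len("abc"))               # 3
print(safe_len(42))                  # None
print(safe_len(x for x in [1, 2]))   # None, generators have no length
```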

Another gotcha is that the length of a sequence can change if you modify it. Be careful not to cache the length and expect it to stay constant:

numbers = [1, 2, 3]
print(len(numbers)) # Output: 3

numbers.append(4)
print(len(numbers)) # Output: 4

Here the length of the numbers list changes from 3 to 4 after we append another element to it.
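A stale cached length becomes a real problem when it is used as an index. The following sketch (with illustrative variable names) shows the failure and one way to recover by recomputing the length.

```python
items = [1, 2, 3]
cached_length = len(items)  # snapshot taken before the list changes

items.pop()  # the list now holds 2 elements

try:
    last = items[cached_length - 1]  # index 2 no longer exists
except IndexError:
    last = items[len(items) - 1]     # recompute the length instead

print(last)  # 2
```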

Performance Considerations

In most cases, len() is a very fast O(1) operation. That's because built-in containers store their size internally, so __len__ simply returns it rather than recounting the elements each time.

However, this isn't always the case. Some custom types may implement __len__ in a way that requires traversing the whole structure to determine the length, which would be an O(n) operation.
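As an illustration of the O(n) case, here is a sketch of a hypothetical singly linked list that does not store its size, so __len__ has to walk every node. The Node and LinkedList classes are invented for this example.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class LinkedList:
    """Hypothetical singly linked list that does not store its size."""

    def __init__(self, values=()):
        self.head = None
        # Build the list back to front so the original order is preserved.
        for value in reversed(list(values)):
            self.head = Node(value, self.head)

    def __len__(self):
        # Walks every node, so the cost grows with the number of elements.
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node.next_node
        return count


print(len(LinkedList("abc")))  # 3
```

Storing a counter that is updated on every insert and removal would bring this back to O(1), at the cost of a little extra bookkeeping.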

Also keep in mind that creating a new list, tuple, set or dictionary just to check its length is much slower than using len() on an existing sequence. So this:

n = len(some_sequence)

Is much more efficient than this:

n = len(list(some_sequence))

Since the latter requires creating a whole new list object just to check the length.
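For iterables that have no length at all, such as generators, it is still possible to count items without building a list. The count_items helper below is a hypothetical sketch; note that it consumes the iterator it is given.

```python
def count_items(iterable):
    """Count items by iterating once, without building a list."""
    return sum(1 for _ in iterable)


squares = (n * n for n in range(1000))  # a generator has no len()
print(count_items(squares))  # 1000
```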

Creative Uses of len()

Beyond just checking the length of an object, len() can be used in some creative ways in Python. Here are a few examples:

Checking if a sequence is empty:
Instead of comparing a list or other sequence to [], you can take advantage of truthiness in Python:

if len(some_list) == 0: # Explicit, but verbose
    print("The list is empty")

if not len(some_list): # Shorter, relies on 0 being falsy
    print("The list is empty")

if not some_list: # Most idiomatic, recommended by PEP 8 for sequences
    print("The list is empty")

Since len() always returns an int, and 0 is falsy while any other integer is truthy, the second form works. The third form is shorter still, because Python itself falls back on an object's length when it tests a container for truth.

Initializing lists of a certain size:
If you need to initialize a list of a specific size, for example one that matches the len() of another sequence, you can use the * operator or a list comprehension:

n = 10
my_list = [0] * n # n zeros
my_list = [0 for _ in range(n)] # equivalent list comprehension

Both forms produce a list of n elements, each initialized to 0.
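For instance, len() can supply the size when the new list should line up with an existing one. The names and scores variables below are made up for this sketch.

```python
names = ["Alice", "Bob", "Charlie"]

# One starting score per name, sized from the existing list.
scores = [0] * len(names)

print(scores)  # [0, 0, 0]
```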

Truncating strings:
You can use len() to truncate a string to a maximum size:

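def truncate(s, max_length):
    return s[:max_length] if len(s) > max_length else s

This function returns s unchanged if its length is less than or equal to max_length; otherwise it truncates s to max_length characters. Slicing alone, s[:max_length], gives the same result, since a slice never fails when the string is shorter than the limit, so the explicit len() check mainly makes the intent clear.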
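A common variation marks the cut with a suffix, which is where len() does real work. The function below, truncate_with_ellipsis, is a hypothetical extension of the idea and not something from the standard library.

```python
def truncate_with_ellipsis(s, max_length, suffix="..."):
    """Shorten s to at most max_length characters, ending in suffix if cut."""
    if len(s) <= max_length:
        return s
    if max_length <= len(suffix):
        # Not enough room for any text plus the suffix, so cut the suffix.
        return suffix[:max_length]
    # Reserve room for the suffix so the result is exactly max_length long.
    return s[: max_length - len(suffix)] + suffix


print(truncate_with_ellipsis("Welcome to Python!", 10))  # Welcome...
print(truncate_with_ellipsis("Hi", 10))                  # Hi
```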

How len() is Implemented

Under the hood, len() is a very thin wrapper around a special method named __len__ (that's two underscores on either side). When you call len(obj), it effectively just does this, with an extra check that the result is a non-negative integer:

return obj.__len__()

The __len__ method is what makes an object "sized" in Python. Sequences such as str, list, and tuple implement it as part of the sequence protocol, and other containers such as dict and set implement it too, which is why len() works on all of them.
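One way to check whether an object supports len() before calling it is the collections.abc.Sized abstract base class from the standard library, which recognizes any class that defines __len__. The supports_len helper is a hypothetical name for this sketch.

```python
from collections.abc import Sized


def supports_len(obj):
    """Return True if obj's class defines __len__."""
    return isinstance(obj, Sized)


print(supports_len("abc"))               # True
print(supports_len({"a": 1}))            # True
print(supports_len(42))                  # False
print(supports_len(x for x in [1, 2]))   # False
```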

You can even implement __len__ on your own custom classes to make them compatible with len():

class MyClass:
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

obj = MyClass([1, 2, 3])
print(len(obj)) # Output: 3

Here we define a custom __len__ method that just returns the length of the self.data attribute. Now len() works on instances of MyClass!
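A side effect worth knowing: when a class defines __len__ but not __bool__, Python uses the length to decide whether an instance is truthy. The Playlist class below is a hypothetical container used only to illustrate this.

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, songs=()):
        self.songs = list(songs)

    def __len__(self):
        return len(self.songs)


empty = Playlist()
full = Playlist(["a", "b"])

print(len(full))    # 2
print(bool(empty))  # False, because __len__ returns 0 and no __bool__ is defined
print(bool(full))   # True
```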

The actual implementation of __len__ for built-in types like list and dict is done in optimized C code in the CPython interpreter. But at the Python level, you can think of it as just returning the value of an internal size or length attribute that gets updated as the object is modified.

Conclusion

As you can see, there's a lot to unpack about what seems like a very simple built-in function! len() is an extremely common tool in most Python code, so it's important to be familiar with its use cases, performance characteristics, and quirks.

Some key points to remember:

  • len() returns the number of characters for strings, number of elements for sequences like lists and tuples, and number of key-value pairs for dictionaries
  • len() only measures the length of the top-level object, not any nested objects
  • len() is usually O(1), but can be O(n) for some custom sequence types
  • Empty sequences and collections have a length of 0
  • You can implement __len__ on your own classes to support len()

I hope this deep dive has helped you fully understand and appreciate the len() function in Python. Stay curious and keep coding!
