Python Sets: A Detailed Visual Introduction

Python provides several built-in data structures that allow you to store and manipulate collections of data in different ways. One of the most useful is the set – an unordered collection of unique elements. Sets make it easy to perform mathematical set operations and remove duplicate values from other collections.

In this article, we'll take an in-depth look at Python sets, understand their key properties, and see how to leverage them effectively in your code. Let's dive in!

What are Sets?

A set in Python is an unordered collection of unique, hashable elements. The key properties that define a set are:

  1. Elements are unordered – the sequence in which elements were added doesn't matter
  2. Each element must be unique – duplicates are not allowed
  3. Elements must be hashable – in practice this means immutable built-ins such as numbers, strings, and tuples of hashable values; lists, dictionaries, and other sets cannot be elements

Here's how a set differs from other built-in collections in Python:

  • Unlike lists and tuples, sets are unordered
  • Unlike lists and tuples, sets cannot contain duplicate values
  • Unlike dictionaries, sets store only individual elements, with no values attached to them

Internally, sets are implemented using hash tables. This allows set operations like membership tests and element deletion to be performed in constant O(1) time on average.
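Because a hash table needs a hash value for every entry, a set only accepts hashable elements. The short sketch below (the variable names are illustrative) shows which common values are accepted and which are rejected:

```python
# Hashable values are accepted as set elements.
valid = {42, "text", (1, 2), frozenset({3})}

# A list is mutable and unhashable, so adding it raises TypeError.
s = set()
try:
    s.add([1, 2])
    rejected = False
except TypeError:
    rejected = True

# A tuple is hashable only if everything inside it is hashable too.
try:
    s.add((1, [2, 3]))
    nested_rejected = False
except TypeError:
    nested_rejected = True

print(len(valid), rejected, nested_rejected)  # 4 True True
```

If you need to store a list-like value in a set, converting it to a tuple first is usually the simplest option.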

Creating Sets

There are two ways to create a set in Python – using the set() constructor or set literal syntax.

Using set() Constructor

To create a set from any iterable, pass it to the set() constructor:

>>> set([1, 2, 3])
{1, 2, 3}

>>> set((1, 2, 2, 3))  # duplicates removed 
{1, 2, 3}

>>> set('hello')  # display order is arbitrary and may differ on your machine
{'o', 'e', 'h', 'l'}

>>> set({'a': 1, 'b': 2})  # dict keys become set elements
{'a', 'b'}

>>> set(range(3))
{0, 1, 2}

Passing an iterator to set() will consume it and store the yielded values. Repeated values will only appear once in the result.
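As a small illustration of that point, the sketch below builds two sets from the same one-shot iterator. The second call finds nothing left to read:

```python
numbers = iter([3, 1, 3, 2])   # a one-shot iterator with a repeated value

first = set(numbers)           # consumes every value from the iterator
second = set(numbers)          # the iterator is already exhausted

print(first == {1, 2, 3})      # True
print(second == set())         # True
```

If the values are needed more than once, build the set once and reuse it rather than passing the same iterator around.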

To create an empty set, you must use set(), not an empty pair of braces {}:

>>> type({})
<class 'dict'>

>>> type(set())
<class 'set'>

Using Set Literal Syntax

You can also define sets using curly braces {} and separating elements by commas:

>>> primes = {2, 3, 5, 7}
>>> odds = {1, 3, 5, 7, 9}

However, this syntax cannot be used to create an empty set, as {} creates an empty dictionary. Always use set() for that.

Set Size and Membership

To get the number of elements in a set, use the len() function:

>>> len({1, 2, 3})
3

>>> len(set())
0

To check if a value is contained in a set, use the in keyword:

>>> 1 in {1, 2, 3}
True

>>> 4 in {1, 2, 3} 
False

The average time complexity of in is O(1), so membership tests are very efficient, especially for large sets.
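One rough way to see the difference is to time the same lookup against a list and a set. This is only a sketch: the collection size and repeat count are arbitrary choices, and the absolute numbers depend on your machine.

```python
import timeit

size = 50_000
as_list = list(range(size))
as_set = set(as_list)
target = size - 1   # worst case for the list: the last element

# The list scans element by element; the set does a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

On most machines the set lookup should come out far faster, and the gap typically widens as the collection grows.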

Modifying Sets

Sets are mutable: you can add and remove elements after creation. Common mutating operations include:

Adding Elements

Use the add() method to add single elements to a set:

>>> s = {1, 2}
>>> s.add(3)
>>> s
{1, 2, 3}

>>> s.add(3)  # adding existing element does nothing
>>> s
{1, 2, 3}

To add multiple elements at once, use update(), passing any iterable:

>>> s = {1, 2}  
>>> s.update([3, 4, 5])
>>> s
{1, 2, 3, 4, 5}
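A few details of update() are easy to miss. The sketch below, using made-up tag names, shows that it accepts several iterables in one call, that |= is the operator form for sets, and that passing a string adds its individual characters:

```python
tags = {"python"}

# update() takes any number of iterables in a single call.
tags.update(["sets", "hashing"], ("tuples",))

# |= is the in-place operator form; its right-hand side must be a set.
tags |= {"dicts"}

letters = set()
letters.update("ab")   # a string is an iterable of characters: adds 'a' and 'b'
letters.add("ab")      # add() stores the whole string as one element

print(sorted(tags))     # ['dicts', 'hashing', 'python', 'sets', 'tuples']
print(sorted(letters))  # ['a', 'ab', 'b']
```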

Removing Elements

There are three ways to remove elements from a set:

  1. remove(elem) – Removes elem from the set. Raises KeyError if elem not found.
  2. discard(elem) – Removes elem from the set if present, does nothing otherwise.
  3. pop() – Removes and returns an arbitrary set element. Raises KeyError if set is empty.

Here are some examples:

>>> s = {1, 2, 3}

>>> s.remove(2)
>>> s
{1, 3}

>>> s.remove(4)  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 4

>>> s.discard(3)  
>>> s
{1}

>>> s.discard(3)  # no error raised
>>> s
{1}

>>> s.pop() 
1
>>> s
set()

>>> s.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'pop from an empty set'

To remove all elements from a set, use the clear() method:

>>> s = {1, 2, 3}
>>> s.clear()  
>>> s
set()
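Which removal method to use depends on whether a missing element is an error in your program. The sketch below, with hypothetical user names, shows one way to combine them: discard() when absence is fine, remove() when it should be noticed, and pop() in a loop to drain a set.

```python
online = {"ana", "ben", "cy"}

# discard() is safe when the element may already be gone.
online.discard("zed")

# remove() signals a missing element with KeyError.
try:
    online.remove("zed")
    missing = False
except KeyError:
    missing = True

# pop() in a loop drains the set; the order is arbitrary.
processed = []
while online:
    processed.append(online.pop())

print(missing)            # True
print(sorted(processed))  # ['ana', 'ben', 'cy']
print(online)             # set()
```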

Set Operations

Python provides several operators and methods to perform common set operations:

Union

The union of two sets A and B is a new set containing every element that is in A, in B, or in both.

There are two ways to perform union:

  • Use the | operator
  • Use the union() method

Here's an example:

>>> a = {1, 2, 3}
>>> b = {2, 3, 4}

>>> a | b
{1, 2, 3, 4}

>>> a.union(b)  
{1, 2, 3, 4}

>>> b.union(a)  
{1, 2, 3, 4}

The Venn diagram below shows the union of two sets A and B:

[Venn diagram of union of sets A and B]
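The operator and the method are not fully interchangeable. As the sketch below shows, union() accepts any number of arbitrary iterables, while | requires both operands to be sets:

```python
a = {1, 2, 3}

# union() takes any iterables, and more than one at a time.
merged = a.union([3, 4], (5,), range(6, 8))

# The | operator only works between sets.
try:
    a | [3, 4]
    operator_rejects_list = False
except TypeError:
    operator_rejects_list = True

print(sorted(merged))          # [1, 2, 3, 4, 5, 6, 7]
print(operator_rejects_list)   # True
print(a)                       # {1, 2, 3} (a itself is unchanged)
```

The same distinction applies to the other set operators and their method counterparts.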

Intersection

The intersection of two sets A and B is a new set containing only elements that are in both A and B.

There are two ways to perform intersection:

  • Use the & operator
  • Use the intersection() method

Here's an example:

>>> a = {1, 2, 3}
>>> b = {2, 3, 4}

>>> a & b
{2, 3} 

>>> a.intersection(b)
{2, 3}

>>> b.intersection(a)  
{2, 3}

The Venn diagram below shows the intersection of two sets A and B:

[Venn diagram of intersection of sets A and B]
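Intersection also extends naturally to more than two sets. The sketch below uses hypothetical skill lists to find what every set in a collection has in common. It assumes the list is non-empty, since set.intersection needs at least one set to start from.

```python
# Hypothetical data: the skills listed by each of three applicants.
skills = [
    {"python", "sql", "git"},
    {"python", "git", "docker"},
    {"git", "python", "rust"},
]

# intersection() accepts any number of arguments, so unpack the list.
shared = set.intersection(*skills)

print(sorted(shared))  # ['git', 'python']
```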

Difference

The difference of two sets A and B is a new set containing elements that are in A but not in B.

There are two ways to perform set difference:

  • Use the - operator
  • Use the difference() method

Here's an example:

>>> a = {1, 2, 3}  
>>> b = {2, 3, 4}

>>> a - b
{1}

>>> a.difference(b)
{1}  

>>> b - a  
{4}

>>> b.difference(a)
{4}

The Venn diagrams below show the difference of sets A and B:

[Venn diagram of A - B] [Venn diagram of B - A]
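Because difference is directional, taking it both ways is a handy way to compare two snapshots of the same data. A minimal sketch, assuming two made-up sets of file names:

```python
old_files = {"a.txt", "b.txt", "c.txt"}
new_files = {"b.txt", "c.txt", "d.txt"}

added = new_files - old_files      # in the new snapshot only
removed = old_files - new_files    # in the old snapshot only
unchanged = old_files & new_files  # present in both

print(added)              # {'d.txt'}
print(removed)            # {'a.txt'}
print(sorted(unchanged))  # ['b.txt', 'c.txt']
```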

Symmetric Difference

The symmetric difference of two sets A and B is a new set containing elements that are in either A or B, but not both.

There are two ways to find the symmetric difference:

  • Use the ^ operator
  • Use the symmetric_difference() method

Here's an example:

>>> a = {1, 2, 3}
>>> b = {2, 3, 4}  

>>> a ^ b
{1, 4}

>>> a.symmetric_difference(b)  
{1, 4}

>>> b ^ a
{1, 4}  

The Venn diagram below shows the symmetric difference of sets A and B:

[Venn diagram of symmetric difference of A and B]
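The symmetric difference can also be expressed in terms of the operations covered earlier, which is a useful way to check your understanding. The sketch below computes it three ways and confirms they agree:

```python
a = {1, 2, 3}
b = {2, 3, 4}

via_operator = a ^ b
via_union = (a | b) - (a & b)        # everything, minus what is shared
via_differences = (a - b) | (b - a)  # only-in-a combined with only-in-b

print(via_operator == via_union == via_differences)  # True
print(sorted(via_operator))                          # [1, 4]
```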

Subset and Superset

A set A is considered a subset of B if all elements of A are also elements of B. Conversely, B is a superset of A if it contains all elements of A.

We can test for subsets and supersets using operators or methods:

  • a <= b – tests if a is a subset of b
  • a.issubset(b) – tests if a is a subset of b
  • b >= a – tests if b is a superset of a
  • b.issuperset(a) – tests if b is a superset of a

Here are some examples:

>>> a = {1, 2}
>>> b = {1, 2, 3}

>>> a <= b
True

>>> a.issubset(b)
True 

>>> b >= a
True

>>> b.issuperset(a)
True

>>> {1, 2, 3, 4} >= {1, 2, 3}
True

>>> {1, 2} >= {1, 2, 3}  
False
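Note that <= and >= allow the two sets to be equal. For a strict (proper) subset or superset test, Python also provides < and >. The sketch below contrasts the two and shows that issubset() accepts any iterable:

```python
a = {1, 2}
b = {1, 2}
c = {1, 2, 3}

print(a <= b)   # True: equal sets are subsets of each other
print(a < b)    # False: a proper subset must be strictly smaller
print(a < c)    # True
print(c > a)    # True

print(set() <= a)              # True: the empty set is a subset of every set
print(a.issubset([1, 2, 3]))   # True: the method accepts any iterable
```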

Disjoint Sets

Two sets are considered disjoint if they have no elements in common. We can test this using:

  • a.isdisjoint(b) – returns True if a and b are disjoint sets

For example:

>>> {1, 2, 3}.isdisjoint({4, 5})  
True

>>> {1, 2}.isdisjoint({1, 4}) 
False

The Venn diagram below shows disjoint sets:

[Venn diagram of disjoint sets]
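Being disjoint is the same as having an empty intersection. The sketch below checks both forms and shows that isdisjoint() accepts any iterable, not just a set:

```python
a = {1, 2, 3}

print(a.isdisjoint([4, 5]))        # True: works with a list
print(a.isdisjoint(range(3, 10)))  # False: 3 is shared

# Equivalent check through the intersection, which builds a new set first.
print(len(a & {4, 5}) == 0)        # True
```

When you only need a yes/no answer, isdisjoint() avoids building the intermediate intersection set.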

Frozensets

Python provides an immutable version of sets called frozensets. They support the same non-mutating operations as regular sets (membership tests, union, intersection, and so on) but cannot be modified once created.

To create a frozenset, use the frozenset() constructor:

>>> fs = frozenset([1, 2, 3])
>>> fs
frozenset({1, 2, 3})

>>> type(fs)
<class 'frozenset'>

The main advantage of using frozensets is that they are hashable, so you can use them as dictionary keys or elements of other sets.
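A short sketch of both uses follows. The place names and the distance value are invented for illustration.

```python
# A frozenset as a dictionary key: the pair is unordered,
# so either ordering finds the same entry.
distances = {frozenset({"north", "south"}): 12}
print(distances[frozenset({"south", "north"})])  # 12

# Frozensets as elements of a set: equal frozensets collapse into one.
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

# Mutating methods such as add() do not exist on a frozenset.
try:
    frozenset({1}).add(2)
    can_mutate = True
except AttributeError:
    can_mutate = False
print(can_mutate)  # False
```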

Use Cases

Here are a few practical examples of using sets in Python:

  1. Remove duplicates from a list (the order of the result is not guaranteed):
>>> lst = [1, 2, 2, 3, 1]
>>> list(set(lst))
[1, 2, 3]
  2. Find distinct elements that are in one list but not the other:
>>> lst1 = [1, 2, 3]
>>> lst2 = [2, 3, 4]
>>> set(lst1) - set(lst2)
{1}
  3. Count number of distinct elements in a list:
>>> len(set([1, 2, 2, 3]))
3
  4. Find elements that two lists have in common:
>>> set([1, 2, 3]) & set([2, 3, 4])
{2, 3}
  5. Implement a simple spam filter:
spam_keywords = {'buy', 'subscribe', 'purchase'}

def is_spam(msg):
    # True if at least one whitespace-separated word is a spam keyword
    return not spam_keywords.isdisjoint(msg.lower().split())
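Two of these use cases have a caveat worth a sketch. Converting a list to a set discards the original order, and splitting on whitespace leaves punctuation attached to words, so a message like "Buy now!" works but "Subscribe, please" would not match. The helpers below are one possible refinement, not the only one. The names dedupe_keep_order and SPAM_KEYWORDS are illustrative, and the first helper relies on dictionaries preserving insertion order (Python 3.7+).

```python
import string

def dedupe_keep_order(items):
    """Remove duplicates while keeping first-seen order."""
    # dict keys are unique and keep insertion order.
    return list(dict.fromkeys(items))

SPAM_KEYWORDS = {"buy", "subscribe", "purchase"}

def is_spam(msg):
    """Return True if any word, stripped of punctuation, is a spam keyword."""
    words = (word.strip(string.punctuation) for word in msg.lower().split())
    return not SPAM_KEYWORDS.isdisjoint(words)

print(dedupe_keep_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
print(is_spam("Buy now!"))                 # True
print(is_spam("Subscribe, please"))        # True
print(is_spam("See you at lunch"))         # False
```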

Conclusion

In this article, we took an in-depth look at Python sets, their properties and common use cases. To recap:

  • Sets are unordered collections of unique elements
  • You can create sets using a {...} literal or the set() constructor; an empty set requires set(), because {} is an empty dictionary
  • Sets support common operations like union, intersection, difference
  • Use frozenset for immutable sets that can be hashed
  • Sets are great for removing duplicates and performing math-like operations on collections

Next time you're working with collections of unique elements in Python, consider using a set to simplify and speed up your code.
