Welcome back to yet another post in the How to Python series. This time I’m looking to step back a little bit to talk about one of Python’s built-in features called the list comprehension. While we’ve used them a few times in the series, I never thought to really explain them until now.
Lately, I’ve been putting together videos for these articles. If you have some time, I recommend checking out this summary which covers all the topics from this article with even more examples. And of course, you get to see my beautiful face!
Unlike other articles in this series, there’s not exactly a concrete problem we’re trying to solve in this article. Instead, the goal is to understand the list comprehension syntax:
nums = [2, 6, 10, -4]
negative_nums = [x for x in nums if x < 0]
What is this bizarre syntax, and how does it work? That’s the goal of the article today. In particular, we’ll look at a few scenarios where a list comprehension is useful such as:
- Duplicating a list
- Modifying a list
- Filtering a list
- Filtering and modifying a list
- Generating all pairs from two lists
- Duplicating nested lists
If you know of anything else we can do with a list comprehension, let me know!
Before we can dive into the solutions, let’s talk about the syntax a bit. Here’s my best attempt at illustrating the concept:
output = [expression(item) for item in some_list]
At the most basic level, we can construct a list comprehension that iterates over each item in some list, performs some expression on that item, and places that new item in an output list. Or as a loop:
output = []
for item in some_list:
    output.append(expression(item))
Of course, we can do a lot more than just create a list from some other list with a list comprehension. In the following subsections, we’ll take a look at a few examples.
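To make the pattern above concrete, here’s a minimal sketch that runs both forms side by side. The expression function and the values in some_list are placeholders I made up for illustration; any function of one argument would work the same way.

```python
# Hypothetical stand-in for "expression" in the pattern above
def expression(item):
    return item + 1

some_list = [2, 5, -4, 6]

# Comprehension form
output = [expression(item) for item in some_list]

# Equivalent loop form
loop_output = []
for item in some_list:
    loop_output.append(expression(item))

print(output)                 # [3, 6, -3, 7]
print(output == loop_output)  # True
```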
Duplicate a List
Perhaps the simplest use of a list comprehension is duplicating another list:
my_list = [2, 5, -4, 6]
output = [item for item in my_list]  # [2, 5, -4, 6]
In this case, output will be equivalent to my_list. For completeness, here’s the same solution as a loop:
my_list = [2, 5, -4, 6]
output = []
for item in my_list:
    output.append(item)
As we can see, the list comprehension is significantly more concise. In either case, we only perform a shallow copy, meaning the items in the new list are references to the same objects as the items in the old list. As a result, it’s a good idea to only use this syntax for copying lists of immutable values like numbers.
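If the shallow copy warning feels abstract, here’s a small sketch of what can go wrong. The nested and shallow names are just for illustration; the point is that the outer lists are separate while the inner lists are shared.

```python
nested = [[1, 2], [3, 4]]
shallow = [item for item in nested]

# Appending to the copy leaves the original outer list alone
shallow.append([5, 6])

# Mutating an inner list shows up in both, since it is the same object
shallow[0].append(99)

print(nested)                   # [[1, 2, 99], [3, 4]]
print(shallow)                  # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow[0] is nested[0])  # True
```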
Modify a List*
Now that we know how to duplicate a list, let’s try modifying the items before we add them to the output list:
my_list = [2, 5, -4, 6]
output = [2 * item for item in my_list]  # [4, 10, -8, 12]
Instead of copying the original list directly, we modified each item by multiplying it by two before storing it in the new list. As a result, we end up with a list where each term is twice as big as it was in the original list. Here’s the same concept using a loop:
my_list = [2, 5, -4, 6]
output = []
for item in my_list:
    output.append(item * 2)
To be clear, as the asterisk probably hints, we didn’t actually change the original list. Instead, we created a completely new list with the items doubled.
If my_list contained objects or some other mutable data type like a list, there would be nothing stopping us from modifying them. Of course, that’s considered bad practice, so I neglected to share an example on the off chance that someone haphazardly copies it into a production system.
Filter a List
While duplicating and modifying lists is fun, sometimes it’s helpful to be able to filter a list:
my_list = [2, 5, -4, 6]
output = [item for item in my_list if item < 0]  # [-4]
In this case, we’ve added a new expression to the rightmost portion of the list comprehension that reads: if item < 0. Of course, the loop equivalent might look something like the following:
my_list = [2, 5, -4, 6]
output = []
for item in my_list:
    if item < 0:
        output.append(item)
In other words, for each item in the list, only consider it if it’s less than zero. If it is, dump it to the new list. As a result, we end up with a list that only contains negative values.
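As a quick sanity check, here’s a sketch comparing the filtering comprehension against Python’s built-in filter function, which isn’t part of the solution above but yields the same result. I’ve also swapped in a different condition (evens is my own example) to show that the if clause accepts any boolean expression.

```python
my_list = [2, 5, -4, 6]

# Filtering with a comprehension
negatives = [item for item in my_list if item < 0]

# Same result using the built-in filter (shown only for comparison)
negatives_via_filter = list(filter(lambda item: item < 0, my_list))

# Any condition works in the if clause
evens = [item for item in my_list if item % 2 == 0]

print(negatives)             # [-4]
print(negatives_via_filter)  # [-4]
print(evens)                 # [2, -4, 6]
print(my_list)               # [2, 5, -4, 6] (unchanged)
```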
Filter and Modify a List
Naturally, we can both modify and filter a list at the same time by combining the syntax:
my_list = [2, 5, -4, 6]
output = [2 * item for item in my_list if item < 0]  # [-8]
In this case, we’ve decided to double all negative values before dumping the results to a list. Once again, the same syntax as a loop might look something like:
my_list = [2, 5, -4, 6]
output = []
for item in my_list:
    if item < 0:
        output.append(item * 2)
As a result, the output list only contains -8. Once again, it’s important to mention that we didn’t actually modify the original list.
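One thing that’s easy to mix up here: an if at the end of the comprehension filters items out, while a conditional expression at the front keeps every item and only changes which value gets stored. The following sketch contrasts the two; the second form isn’t covered above, so treat it as a side note rather than part of the solution.

```python
my_list = [2, 5, -4, 6]

# Trailing if: drop non-negatives, then double what remains
filtered_and_doubled = [2 * item for item in my_list if item < 0]

# Leading conditional expression: keep every item, double only the negatives
doubled_in_place = [2 * item if item < 0 else item for item in my_list]

print(filtered_and_doubled)  # [-8]
print(doubled_in_place)      # [2, 5, -8, 6]
print(my_list)               # [2, 5, -4, 6] (unchanged)
```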
Generate All Pairs from Two Lists
Now, we’re starting to get into some of the more advanced features of list comprehensions. In particular, we’re looking to generate pairs of values between two lists:
# [(1, 2), (1, 4), (1, 6), (3, 2), (3, 4), (3, 6), (5, 2), (5, 4), (5, 6)]
output = [(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]
Here, we’ve created a list that contains all combinations of pairs from two lists. As usual, we can implement the same thing with the following set of loops:
output = []
for a in (1, 3, 5):
    for b in (2, 4, 6):
        output.append((a, b))
If we wanted to make things more interesting, we could apply some filtering:
# [(3, 2), (5, 2), (5, 4)]
output = [(a, b) for a in (1, 3, 5) for b in (2, 4, 6) if a > b]
In this case, we only generate a pair if the number from the first list is larger than the number from the second list.
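To double-check the ordering, here’s a short sketch that compares the comprehension against itertools.product from the standard library. That function isn’t used in the solution above; I’m only borrowing it as a reference, since it produces the same pairs with the leftmost sequence varying slowest. The names first and second are my own.

```python
from itertools import product

first = (1, 3, 5)
second = (2, 4, 6)

# The leftmost "for" acts as the outer loop, the next one as the inner loop
pairs = [(a, b) for a in first for b in second]

# itertools.product yields the same pairs in the same order
pairs_via_product = list(product(first, second))

# The trailing if is evaluated once per (a, b) combination
filtered = [(a, b) for a in first for b in second if a > b]

print(pairs == pairs_via_product)  # True
print(filtered)                    # [(3, 2), (5, 2), (5, 4)]
```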
Duplicate Nested Lists
With the shallow copy example mentioned earlier, we’re not able to duplicate nested lists such as two-dimensional matrices. To do that, we can leverage nested list comprehensions:
my_list = [[1, 2], [3, 4]]
output = [[item for item in sub_list] for sub_list in my_list]
print(output)  # Prints [[1, 2], [3, 4]]
Instead of performing a surface-level copy, we retrieve each list and copy them using the same comprehension from before. As you can probably imagine, we could abstract this concept into a recursive function which performs a list comprehension on every dimension of the matrix:
def deep_copy(to_copy):
    if type(to_copy) is list:
        return [deep_copy(item) for item in to_copy]
    else:
        return to_copy
How cool is that? Of course, if you have anything other than numbers or strings at the deepest levels of your matrix, you’ll have to handle the rest of the cloning process yourself.
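To see the recursive copy in action, here’s a rough sketch that clones an unevenly nested list and then mutates the clone. The matrix value is made up for the demo. For real projects, the standard library’s copy.deepcopy handles far more than lists, so I’ve included it for comparison only.

```python
import copy

def deep_copy(to_copy):
    # Recurse into lists; return anything else as-is
    if type(to_copy) is list:
        return [deep_copy(item) for item in to_copy]
    else:
        return to_copy

matrix = [[1, 2], [3, [4, 5]]]
clone = deep_copy(matrix)

# Mutating a deeply nested list in the clone leaves the original alone
clone[1][1].append(6)

print(matrix)  # [[1, 2], [3, [4, 5]]]
print(clone)   # [[1, 2], [3, [4, 5, 6]]]

# The standard library version gives an equal, independent copy as well
print(copy.deepcopy(matrix) == matrix)  # True
```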
A Little Recap
As always, here is a giant dump of all the examples covered in this article with comments briefly explaining each snippet. Feel free to grab what you need and go! If you’d like to play with any of these solutions, I’ve put them all in a Jupyter Notebook for your pleasure.
# Define a generic 1D list of constants
my_list = [2, 5, -4, 6]

# Duplicate a 1D list of constants
[item for item in my_list]

# Duplicate and scale a 1D list of constants
[2 * item for item in my_list]

# Duplicate and filter out non-negatives from 1D list of constants
[item for item in my_list if item < 0]

# Duplicate, filter, and scale a 1D list of constants
[2 * item for item in my_list if item < 0]

# Generate all possible pairs from two lists
[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]

# Redefine list of contents to be 2D
my_list = [[1, 2], [3, 4]]

# Duplicate a 2D list
[[item for item in sub_list] for sub_list in my_list]

# Duplicate an n-dimensional list
def deep_copy(to_copy):
    if type(to_copy) is list:
        return [deep_copy(item) for item in to_copy]
    else:
        return to_copy
I hope you had as much fun reading through this article on list comprehensions as I did writing it. I think at this point in the series I’m going to start exploring basic concepts like this and stretching them to their limits. Do you have a Python concept that you’d like explored? Let me know!
In the meantime, why not check out some of these other awesome Python articles:
- Rock, Paper, Scissors Using Modular Arithmetic
- How to Check if a File Exists in Python
- How to Parse a Spreadsheet in Python
And, if you’re feeling extra generous, make your way over to the members page and take a look at your options. At any rate, thanks again for the support. Come back soon!