Earlier in this series, I wrote a couple of articles on how to sort different types of lists in Python. For instance, I wrote one article on how to sort a list of strings. Then, later I wrote an article on how to sort a list of dictionaries. In both of those article, I used a few elegant solutions that are afforded to use by the Python standard library. Of course, what if we want to write our own sorting algorithm? That’s our topic for today!
As it turns out, there are a lot of ways to write your own brute force sorting algorithm in Python. For instance, you could try to implement selection sort, bubble sort, or insertion sort. For fun, you might even roll your own bogosort. In this article, we’ll take a look at solutions for all four algorithms.
Table of Contents
Problem Description
If you’ve ever taken a data structures or algorithms course, you’re probably familiar with the different ways we can store and manage data in a program. For example, we might store information in a list because we want to be able to access it at random quickly. Alternatively, we might opt for a dictionary because we want a quick way to lookup values.
Whatever data structure we choose, there are various ways we can interact with it. For example, a stack usually has push and pop operations. Meanwhile, a list might have insert and remove operations.
In this article, we’ll be taking a look at the Python list which can function as a lot of different data structures (e.g. stacks, queues, etc.). For our purposes, we’ll be treating it like an array of integers:
my_list = [4, -7, 5, 4] my_sorted_list = [-7, 4, 4, 5]
Now, the question is: what can we do with a list of integers? Well, we could try summing them up. Alternatively, we might look for the mean, median, and mode. That said, you’re not here to do any of that. You want to know how to sort this thing.
That said, sorting can mean a lot of different things depending on the context. Of course, as my buddy Robert said:
To sort means to flip the proverbial bird at entropy
In other words, the goal of sorting is to take the chaos of some list and organize it in some specific order. For example, if we sort this list of integers, we could organize the values in ascending or descending order. Luckily, most of the algorithms we’ll be looking at in this article will work for any sortable data like strings and characters.
Specifically, our goal will be to write a few list sorting algorithms by hand. In other words, we won’t be using any of the straightforward solutions outlined in the previous articles. Instead, we’ll write our own loops to implement some of the common poor-performing algorithms like bubble sort, insertion sort, and selection sort (i.e O(N2)). After all, each of these poor-performing algorithms work on the basis of brute force: sort one element per pass.
For now, we won’t bother talking about Big O notation, but if you’re interested in that sort of thing, I wrote an article about it ages ago.
Solutions
As I mentioned already, we’ll take a look at three typical brute force sorting algorithms: bubble sort, insertion sort, and selection sort. Of course, we won’t leave here without at least one fun sorting algorithm (hint: it’s bogo sort).
Sort a List With Bubble Sort
If you’re not familiar with bubble sort, we’ve written about the algorithm for the Sample Programs repo. To summarize, bubble sort is an algorithm which relies on swapping consecutive pairs of elements. As a result, large values tend to “bubble” up to the top of the list. To see this algorithm in action, check out the following video:
At any rate, here’s a simple Python implementation of bubble sort:
my_list = [4, -7, 5, 4] is_sorted = False while not is_sorted: is_sorted = True for i in range(len(my_list) - 1): if my_list[i] > my_list[i + 1]: my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i] is_sorted = False
I wrote this algorithm based on the pseudocode provided in Dr. Shun Yan Cheung’s bubble sort notes. Essentially, it works by continually swapping pairs of consecutive elements that are out of order until there are no more swaps to be made. For example, on the first pass, we end up with the following change:
[4, -7, 5, 4] # Initial list [-7, 4, 4, 5] # After the initial iteration
Interestingly, we actually end up with a sorted list after the first pass in this case. Of course, that is almost never the case. For example, if we change the list as follows:
[5, 4, 3, 2, 1]
We will only see the 5 move on the first pass:
[5, 4, 3, 2, 1] # Initial list [4, 3, 2, 1, 5] # After the first iteration
In other words, we end up with our worst nightmare: a list that’s in reverse sorted order.
At any rate, the portion of the code that performs each swap is the inner loop:
for i in range(len(my_list) - 1): if my_list[i] > my_list[i + 1]: my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i] is_sorted = False
Meanwhile, the code that detects if the list is sorted is the outer loop:
is_sorted = False while not is_sorted: is_sorted = True
Of course, the actual mechanism that tells us if the list is not sorted is the line is_sorted = False
in the inner loop. If there aren’t any swaps needed for a pass of the list, the is_sorted
variable stays true. In other words, we’re done!
As you can probably imagine, there are some minor optimizations we can make with this algorithm. For example, we know that each pass moves the current largest element to the end of the list. As a result, we could reduce our number of checks by “shrinking” our list by one each iteration. Of course, I’ll leave that exercise up to you.
Sort a List With Insertion Sort
If bubble sort isn’t your style, perhaps you might like to try insertion sort. Once again, I won’t go into too much detail on this algorithm because we’ve written about it for the Sample Programs repo. That said, the basic idea behind insertion sort is to treat a subset of the list as sorted and increasing grow that collection by inserting elements in it from the unsorted set—or visually:
In terms of implementation, we can write the insertion sort algorithm as follows:
my_list = [4, -7, 5, 4] for i in range(1, len(my_list)): to_swap = my_list[i] j = i - 1 while j >= 0 and my_list[j] > to_swap: my_list[j + 1] = my_list[j] j -= 1 my_list[j + 1] = to_swap
Once again, this solution was borrowed from the pseudocode on Algorithmist. It works by starting at the first index (i.e. i = 1
) and comparing that element with the element at the zeroth index (i.e. j < 1
). If a swap is needed, the items are swapped. In this case, the second item is smaller than the first, so we end up with the following change:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After the first iteration
Next, the algorithm moves to the second index (i.e. i = 2
) and begins working backward (i.e. j < 2
) to find where that item fits in the first two items. In this case, 5 is already larger than 4, so we don’t need to perform any swaps:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After the first iteration [-7, 4, 5, 4] # After the second iteration
Finally, the outer loop moves to the final element (i.e i = 3
) and begins scanning through the sorted portion of the list (i.e. j < 3
) to find where the current item goes. In this case, we only need to check as far back as the first index to figure out where 4 goes. As a result, we’re finished:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After the first iteration [-7, 4, 5, 4] # After the second iteration [-7, 4, 4, 5] # After the third iteration
One thing to note is that swaps occur as we work backward through the sorted list. For example, in the last iteration, we found out that 5 was bigger than 4. At that point, we were able to move 5 into the last position. The portion of the code that handles swapping is the inner loop:
while j >= 0 and my_list[j] > to_swap: my_list[j + 1] = my_list[j] j -= 1
Meanwhile, the outer loop tracks the point that divides the sorted portion of the list from the unsorted portion and performs insertion:
for i in range(1, len(my_list)): to_swap = my_list[i] j = i - 1 # Inner loop my_list[j + 1] = to_swap
As you can probably imagine, there are more pythonic ways to write this solution. For example, Haseeb Majid chose to split the list in half and reassemble it with the latest item inserted in the correct place. If you know of any better solutions, feel free to share them in the comments.
Sort a List With Selection Sort
Now that we’ve seen insertion sort, it’s not too much of a stretch to begin talking about selection sort. After all, the algorithm is quite similar. However, instead of inserting an item in a sorted sublist, we seek out the smallest item from the unsorted sublist and add it to the end of the sorted sublist. For more information, check out the description of selection sort in the Sample Programs repo. Otherwise, here’s a nice visualization:
In terms of actual code, here’s a potential solution in Python:
my_list = [4, -7, 5, 4] for i in range(len(my_list)): min_index = i for j in range(i + 1, len(my_list)): if my_list[j] < my_list[min_index]: min_index = j my_list[i], my_list[min_index] = my_list[min_index], my_list[i]
As usual, I based this solution off a solution written in C on the selection sort Wikipedia page. It works by starting from the first element in the list (i.e. i = 0
) and searching for the smallest element in the list (i.e. j > 0
). After a complete pass, we know we’ve found the smallest element (min_index = 1
), so we can perform our swap. On the first pass, we end up with the following change:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After first iteration
Then, we move our main pointer (i.e. i = 1
) and begin searching the unsorted portion of the list (i.e. j > 1
) for the smallest value. On the second pass, we end up with the following change:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After the first iteration [-7, 4, 5, 4] # After the second iteration
In this case nothing changes because 4 is in the correct position. Then, on the next iteration (i.e. i = 2
), we search the unsorted portion of the list (i.e j > 2
) for the smallest remaining value. In this case, it’s the other 4:
[4, -7, 5, 4] # Initial list [-7, 4, 5, 4] # After the first iteration [-7, 4, 5, 4] # After the second iteration [-7, 4, 4, 5] # After the third iteration
At this point, the list is sorted.
Naturally, the portion of the code responsible for performing the search is the inner loop:
for j in range(i + 1, len(my_list)): if my_list[j] < my_list[min_index]: min_index = j
Meanwhile, the portion of code responsible for tracking the end of the sorted list and performing the swap is the outer loop:
for i in range(len(my_list)): min_index = i # Inner loop my_list[i], my_list[min_index] = my_list[min_index], my_list[i]
Again, I’m sure there are more clever ways to write this solution using Python. For example, we could use a two-list approach (as Haseeb did) which allows us to use the min
, append
, and remove
functions. In other words, no explicit loops. If you know of other clever ways to implement selection sort, let me know in the comments.
Sort a List With Bogosort
Now that we’ve gone through the three main brute force sorting algorithms, I figured we could look at another brute force method: bogosort. Rather than continually placing one element in the correct place on each pass, we’ll just move the elements at random until we sort the list. Here’s what that might look like in Python:
my_list = [4, -7, 5, 4] import random is_sorted = False while not is_sorted: random.shuffle(my_list) last_item = my_list[0] is_sorted = True for item in my_list: if last_item > item: is_sorted = False last_item = item
Here, we leverage a helpful package called random
which has a utility for shuffling lists. To start, we shuffle the list assuming the list isn’t already sorted. Then, we check to see if the list is sorted. If so, we’re done. Otherwise, we repeat the cycle.
To see this in action, let’s look at what could happen. First, we’ll shuffle the list:
[4, -7, 5, 4] # Initial list [5, 4, 4, -7] # After first iteration
As we can see, the list isn’t sorted. We’ll confirm that by checking each pair of values in sequential order. If we don’t see any pairs out of order, we stop. However, in this case, 5 is greater than 4, so we know the list isn’t sorted. As a result, we shuffle again:
[4, -7, 5, 4] # Initial list [5, 4, 4, -7] # After first iteration [-7, 4, 5, 4] # After second iteration
As we can imagine, this process could go on for a long time. Here’s an actual sequence of permutations I got when I ran the solution above:
[5, 4, 4, -7] [-7, 4, 5, 4] [5, 4, -7, 4] [4, 4, -7, 5] [4, 5, 4, -7] [4, 5, 4, -7] [4, 5, -7, 4] [4, 5, 4, -7] [-7, 4, 4, 5]
Now, that’s just for four elements. Imagine how long this could take with even more elements. Or, better yet, don’t imagine it at all. Here’s a visualization of the algorithm failing repeatedly for 100 elements:
Fortunately, there is a slight improvement that can be made to this algorithm. Instead of generating states at random, we could keep track of states we’ve already made and only generate new states. That way, we wouldn’t waste time generating repeated states.
Unfortunately, the deterministic version of bogosort is still very, very bad. Specifically, the algorithm is O(N!). In our four item case, we’d have a worst case runtime of checking 4! (24) states. Meanwhile, all of the algorithms mentioned thus far operate at O(N2) which means at worst 16 comparisons. As you can probably imagine, this is bad news for bogosort in the long term:
N | O(N2) Comparisons | O(N!) Comparisons |
---|---|---|
4 | 16 | 24 |
5 | 25 | 120 |
6 | 36 | 720 |
7 | 49 | 5040 |
8 | 64 | 40320 |
For fun, we’ll take a look at the performance of these algorithms in the next section.
Performance
To test each solution, we’ll need to build up some strings:
setup = """ import random size = 4 max = 30 """ bubble_sort = """ my_list = random.sample(range(max), size) is_sorted = False while not is_sorted: is_sorted = True for i in range(len(my_list) - 1): if my_list[i] > my_list[i + 1]: my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i] is_sorted = False """ insertion_sort = """ my_list = random.sample(range(max), size) for i in range(1, len(my_list)): to_swap = my_list[i] j = i - 1 while j >= 0 and my_list[j] > to_swap: my_list[j + 1] = my_list[j] j -= 1 my_list[j + 1] = to_swap """ selection_sort = """ my_list = random.sample(range(max), size) for i in range(len(my_list)): min_index = i for j in range(i + 1, len(my_list)): if my_list[j] < my_list[min_index]: min_index = j my_list[i], my_list[min_index] = my_list[min_index], my_list[i] """ bogo_sort = """ my_list = random.sample(range(max), size) is_sorted = False while not is_sorted: random.shuffle(my_list) last_item = my_list[0] is_sorted = True for item in my_list: if last_item > item: is_sorted = False last_item = item """
For this test, I introduced random list generation, so we could get more consistent testing. Unfortunately, the random sampling does add to the test time. However, since it’s the same line of code for all the snippets, I suspect it only adds an overhead.
At any rate, to actually test these snippets, we just need to invoke timeit
:
>>> import timeit >>> min(timeit.repeat(setup=setup, stmt=bubble_sort)) 9.461616800001138 >>> min(timeit.repeat(setup=setup, stmt=insertion_sort)) 7.850697500000024 >>> min(timeit.repeat(setup=setup, stmt=selection_sort)) 9.171850900000209 >>> min(timeit.repeat(setup=setup, stmt=bogo_sort)) 92.38232779999998
As you can probably imagine, I waited a concerning amount of time for that bogosort test to finish. Beyond that, I was most surprised by the performance of the selection sort algorithm. As it turns out, insertion sort generally performs less swaps than bubble sort and less comparisons than selection sort.
If you’re interested in seeing how these solutions scale, I modified the size parameter just for you. However, I didn’t retest bogosort:
>>> setup = """ import random size = 10 max = 30 """ >>> min(timeit.repeat(setup=setup, stmt=bubble_sort)) 29.55873109999993 >>> min(timeit.repeat(setup=setup, stmt=insertion_sort)) 20.157115599999088 >>> min(timeit.repeat(setup=setup, stmt=selection_sort)) 23.557934999998906
Here, we can see that selection sort is beginning to overtake bubble sort. However, it’s still not quite as fast as insertion sort. Naturally, I took to Google to find out exactly why this discrepancy exists. Thankfully, Stack Overflow user Cody Gray has a comprehensive answer. In short, they claimed that these discrepancies are expected. In fact, insertion sort is expected to outperform selection sort which is expected to outperform bubble sort. How cool is that?!
At any rate, I recommend taking this measurements with a grain of salt. For context, I tested each solution using Python 3.7.3 on a Windows machine. In other words, your results may vary. If you’re interested in learning more about this performance testing process, I have an article for that.
Challenge
If you liked learning about the different brute force sorting algorithms, I have a challenge for you:
Rewrite one of the brute force sorting algorithms (e.g. bubble sort, selection sort, insertion sort, or bogo sort) for your favorite data type (e.g. string, float, tuple, etc.).
There are a ton of different data types out there that you might be interested in sorting. For instance, maybe you want to alphabetize a list of names. Perhaps you have a list of address, and you want to sort them by distance from you.
Whatever data type you choose, find a way to rewrite the existing algorithms to accommodate them. As always, I’ll come up with a solution for my favorite data type, and I’ll share it below in the comments. I recommend you do the same!
A Little Recap
As always, let’s take a look at all of our solutions in one place:
my_list = random.sample(range(max), size) def bubble_sort(my_list): is_sorted = False while not is_sorted: is_sorted = True for i in range(len(my_list) - 1): if my_list[i] > my_list[i + 1]: my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i] is_sorted = False def insertion_sort(my_list): for i in range(1, len(my_list)): to_swap = my_list[i] j = i - 1 while j >= 0 and my_list[j] > to_swap: my_list[j + 1] = my_list[j] j -= 1 my_list[j + 1] = to_swap def selection_sort(my_list): for i in range(len(my_list)): min_index = i for j in range(i + 1, len(my_list)): if my_list[j] < my_list[min_index]: min_index = j my_list[i], my_list[min_index] = my_list[min_index], my_list[i] def bogosort(my_list): is_sorted = False while not is_sorted: random.shuffle(my_list) last_item = my_list[0] is_sorted = True for item in my_list: if last_item > item: is_sorted = False last_item = item
This time around, I decided to wrap the solutions in functions, so you could snag the code for yourself. Let me know if that’s helpful.
With all that said, that’s all I’ve got. If you enjoyed this article, and you’d like to help this site grow, check out my list of ways you can support The Renegade Coder. Alternatively, check out some of these Python books on Amazon (ad):
While you’re here, you might enjoy some of these articles as well:
Otherwise, thanks for taking some time to check out my site. I appreciate it!
Recent Posts
Recently, I was thinking about the old Pavlov's dog story and how we hardly treat our students any different. While reflecting on this idea, I decided to write the whole thing up for others to read....
In the world of programming languages, expressions are an interesting concept that folks tend to implicitly understand but might not be able to define. As a result, I figured I'd take a crack at...