Today, I’m kicking off a new series of educational Python articles that focuses on reverse engineering common Python functions. To start the series, I figured we’d take a look at an implementation of an uppercase function similar to upper(). Let’s see how we do!
Problem Description
Recently, I wrote an article on how to capitalize a string in Python, and I had an idea. What if I put together a series of articles on implementing existing Python functionality? This would allow me to teach a bit of my thought process while also giving me an endless supply of articles to write, so I decided to give it a go.
To kick off this series, I thought it would be fun to explore a method closely related to capitalization: upper(). If you’re not familiar with this method, here’s the official method description:
Return a copy of the string with all the cased characters converted to uppercase. Note that s.upper().isupper() might be False if s contains uncased characters or if the Unicode category of the resulting character(s) is not “Lu” (Letter, uppercase), but e.g. “Lt” (Letter, titlecase).
The uppercasing algorithm used is described in section 3.13 of the Unicode Standard.
Source: Python Documentation
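To make the caveats in that description concrete, here is a small sketch that pokes at the built-in method. The sample strings are my own picks for illustration and do not come from the documentation.

```python
# Illustrative checks against the built-in str.upper(); the sample
# strings are arbitrary choices, not taken from the Python docs.
samples = ["example", "123abc", "123", "straße"]

for s in samples:
    converted = s.upper()
    # isupper() needs at least one cased character, so "123" reports False
    print(repr(s), "->", repr(converted), "| isupper:", converted.isupper())

# Full Unicode uppercasing can change the length of a string:
# the single character "ß" becomes the two characters "SS".
length_change = (len("straße"), len("straße".upper()))
```

Cases like the last one are a big part of why a simplified, letters-only version is a friendlier starting point.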
Ultimately, today’s goal is to write our own upper() function in line with the description above. That said, as with most of my work on strings, I’ll simplify things considerably. Here are the uppercase and lowercase character sets we’ll be working with today:
lowercase = "abcdefghijklmnopqrstuvwxyz"
uppercase = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
Any function we develop today should then behave as follows:
>>> upper("example")
'EXAMPLE'
>>> upper("123abc")
'123ABC'
>>> upper("HOWDY")
'HOWDY'
In the next section, we’ll talk about the thought process I’d use to solve this problem.
Thought Process
For me, when it comes to writing up a solution, I like to think about the expected behavior in terms of a black box. In other words, we don’t really know how upper() works, but we do know two things: input and expected output.
- Input: a string
- Output: a string with all cased characters converted to uppercase
Or if you’d like it in Python format, here’s what the function definition might look like in a file called roll_your_own.py:
def upper(string):
    pass
Ultimately, we need to figure out how to transform the input into the expected output. In this case, the transformation probably involves finding all the lowercase letters and converting them to uppercase characters.
What else do we know? Well, we know strings cannot be modified, so we’ll need to build a new string to return. In addition, we know the transformation is not just going to be a process of converting lowercase letters to uppercase letters. We’ll also need to distinguish lowercase letters from every other character.
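If the point about strings not being modifiable is new to you, the following minimal sketch shows what happens when we try. The variable names are just placeholders for illustration.

```python
# A minimal demonstration that str objects cannot be changed in place.
word = "example"

try:
    word[0] = "E"  # item assignment is not supported on str
    outcome = "modified in place"
except TypeError:
    outcome = "TypeError"

# Building a new string works fine and leaves the original untouched.
new_word = "E" + word[1:]
print(outcome, word, new_word)
```

In other words, whatever we write has to assemble a fresh string rather than edit the one it was given.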
Based on this information, there are probably going to be a few steps:
- Identify characters that need to be transformed
- Convert them
- Add them to a new string
- Return the result
Perhaps the most straightforward way to do this would be to scan each character in the string and add it to a new string. Of course, a plain copy of the string isn’t the goal, so if the current character is lowercase, we convert it before adding it to the new string.
Testing
Now, there are a lot of ways to implement the solution we came up with and probably dozens of ways that use different steps. Regardless of the solution we come up with, we’ll want to make sure that it’s valid. To do that, we should write a few tests.
Personally, I’ve followed the same crude testing scheme since my first programming course in 2012: first, middle, last, zero, one, many. In our case, this simple testing scheme basically breaks down as follows:
- First: a lowercase character appears as the first character in the string
- Middle: a lowercase character appears somewhere in the middle of the string
- Last: a lowercase character appears as the last character in the string
- Zero: an empty string
- One: a string of one character
- Many: a string of many characters
Obviously, this list is not exhaustive, but it’s a great start.
For completeness, I’ll share how I’d write those tests as well. Assuming the example file from before (i.e. roll_your_own.py), we can create a test file in the same folder called test.py. The test file should look as follows:
import unittest
import importlib

roll_your_own = importlib.import_module("roll_your_own")

class TestUpper(unittest.TestCase):

    def test_upper_first(self):
        self.assertEqual(
            roll_your_own.upper("aPPLE"),
            "APPLE",
            "Failed to uppercase 'a' in 'aPPLE'"
        )

    def test_upper_middle(self):
        self.assertEqual(
            roll_your_own.upper("ApPLe"),
            "APPLE",
            "Failed to uppercase 'p' in 'ApPLE'"
        )

    def test_upper_last(self):
        self.assertEqual(
            roll_your_own.upper("APPLe"),
            "APPLE",
            "Failed to uppercase 'e' in 'APPLe'"
        )

    def test_upper_zero(self):
        self.assertEqual(
            roll_your_own.upper(""),
            "",
            "Failed to return empty string unchanged"
        )

    def test_upper_one(self):
        self.assertEqual(
            roll_your_own.upper("a"),
            "A",
            "Failed to uppercase a single letter"
        )

    def test_upper_many(self):
        self.assertEqual(
            roll_your_own.upper("how now brown cow"),
            "HOW NOW BROWN COW",
            "Failed to uppercase many letters"
        )

if __name__ == '__main__':
    unittest.main()
And to be sure the testing works, we should see something like the following when we run it:
FFFFFF
======================================================================
FAIL: test_upper_first (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 9, in test_upper_first
    self.assertEqual(roll_your_own.upper("aPPLE"), "APPLE", "Failed to uppercase 'a' in 'aPPLE'")
AssertionError: None != 'APPLE' : Failed to uppercase 'a' in 'aPPLE'

======================================================================
FAIL: test_upper_last (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 15, in test_upper_last
    self.assertEqual(roll_your_own.upper("APPLe"), "APPLE", "Failed to uppercase 'e' in 'APPLe'")
AssertionError: None != 'APPLE' : Failed to uppercase 'e' in 'APPLe'

======================================================================
FAIL: test_upper_many (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 24, in test_upper_many
    self.assertEqual(roll_your_own.upper("how now brown cow"), "HOW NOW BROWN COW", "Failed to uppercase many letters")
AssertionError: None != 'HOW NOW BROWN COW' : Failed to uppercase many letters

======================================================================
FAIL: test_upper_middle (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 12, in test_upper_middle
    self.assertEqual(roll_your_own.upper("ApPLe"), "APPLE", "Failed to uppercase 'p' in 'ApPLE'")
AssertionError: None != 'APPLE' : Failed to uppercase 'p' in 'ApPLE'

======================================================================
FAIL: test_upper_one (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 21, in test_upper_one
    self.assertEqual(roll_your_own.upper("a"), "A", "Failed to uppercase a single letter")
AssertionError: None != 'A' : Failed to uppercase a single letter

======================================================================
FAIL: test_upper_zero (__main__.TestUpper)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\jerem\Downloads\test\test.py", line 18, in test_upper_zero
    self.assertEqual(roll_your_own.upper(""), "", "Failed to return empty string unchanged")
AssertionError: None != '' : Failed to return empty string unchanged

----------------------------------------------------------------------
Ran 6 tests in 0.013s

FAILED (failures=6)
With that out of the way, let’s go ahead and write ourselves a solution!
Solution
As I mentioned above, my general approach to uppercasing a string will be as follows:
- Identify characters that need to be transformed
- Convert them
- Add them to a new string
- Return the result
Let’s tackle each step one at a time.
Identify Lowercase Characters
To identify lowercase characters, we’re going to need some sort of mechanism for retrieving each character. There are a couple of ways to do this, but they basically fall into two camps: recursion and iteration. Here’s an example for each:
Iteration
def upper(string):
    result = ""
    for character in string:
        result += character
    return result
Recursion
def upper(string):
    if string:
        return string[0] + upper(string[1:])
    return string
Both of these examples have the same behavior: they create a copy of the original string. It’s up to you to decide which approach you’ll take, but I’m fond of the iterative approach.
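One practical difference may help with that decision: the recursive version makes one call per character, so it can run into the interpreter’s recursion limit on long inputs. The sketch below illustrates this under a couple of assumptions: the function names are hypothetical stand-ins for the two examples above, and the oversized test string is contrived.

```python
import sys


def upper_copy_recursive(string):
    # Same shape as the recursive example: one call per character.
    if string:
        return string[0] + upper_copy_recursive(string[1:])
    return string


def upper_copy_iterative(string):
    # Same shape as the iterative example: one loop pass per character.
    result = ""
    for character in string:
        result += character
    return result


# A contrived input that is longer than the current recursion limit.
long_string = "a" * (sys.getrecursionlimit() + 100)

try:
    upper_copy_recursive(long_string)
    recursive_outcome = "ok"
except RecursionError:
    recursive_outcome = "RecursionError"

# The loop has no such ceiling and copies the whole string.
iterative_outcome = len(upper_copy_iterative(long_string))
print(recursive_outcome, iterative_outcome)
```

For everyday strings either approach is fine, but the loop avoids this failure mode entirely.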
Now that we have a way of retrieving each character from the string, we need some way to check if it’s lowercase. If you read my capitalization article, then you know there are several ways to do this. Personally, I like using the ordinal values of each character to identify characters in the range of all lowercase values (i.e. 97 – 122). To do that, we need an if statement:
def upper(string):
    result = ""
    for character in string:
        if 97 <= ord(character) <= 122:
            pass
        result += character
    return result
Alternatively, it’s entirely possible to search a string that has all of the lowercase letters of the alphabet:
def upper(string):
    lowercase = 'abcdefghijklmnopqrstuvwxyz'
    result = ""
    for character in string:
        if character in lowercase:
            pass
        result += character
    return result
Personally, I think the string of characters is a bit ugly, but I’d argue the code is more readable due to the lack of magic numbers. That said, we’ll stick with the ordinal value solution for now.
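If the magic numbers bother you, there are ways to soften both approaches. Here is a rough sketch: the constant and helper names are my own inventions for illustration, while string.ascii_lowercase is a real constant from the standard library.

```python
import string

# Named bounds replace the magic numbers 97 and 122.
LOWER_START = ord("a")
LOWER_END = ord("z")


def is_lowercase_by_ordinal(character):
    # Assumes a single character; True only for the 26 ASCII lowercase letters.
    return LOWER_START <= ord(character) <= LOWER_END


def is_lowercase_by_lookup(character):
    # Assumes a single character; string.ascii_lowercase is
    # the stdlib constant "abcdefghijklmnopqrstuvwxyz".
    return character in string.ascii_lowercase


print(is_lowercase_by_ordinal("q"), is_lowercase_by_lookup("Q"))
```

One caveat if you fold this into upper(string): a parameter named string would shadow the string module inside the function, so one of the two would need a different name.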
Convert Lowercase Characters to Uppercase
Now that we’ve managed to identify all of the lowercase characters, we’ll need some conversion logic. Since we’re using the ordinal values, we’ll need some sort of mapping from lowercase to uppercase.
Luckily, all of the lowercase values can be found in the range of 97 to 122 while all of the uppercase values can be found in the range of 65 to 90. As it turns out, the offset between these ranges is 32. In other words, we can take the ordinal value of any lowercase letter and subtract 32 from it to obtain its uppercase counterpart. Here’s what that looks like in the code:
def upper(string):
    result = ""
    for character in string:
        if 97 <= ord(character) <= 122:
            uppercase = ord(character) - 32
        result += character
    return result
And if you’re like me and hate to see duplicate code, you might pull out the call to ord():
def upper(string):
    result = ""
    for character in string:
        ordinal = ord(character) - 32
        if 65 <= ordinal <= 90:
            pass
        result += character
    return result
Here, we compute the shift ahead of time and save it in a variable. If the shifted value falls in the range of the uppercase letters, we know we had a lowercase letter. At this time, we don’t do anything with the value. That’s the next step!
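To see why checking the shifted value works, it can help to run the arithmetic on a handful of characters. The following sketch does exactly that. The sample characters and the SHIFT name are my own choices for illustration.

```python
# Worked arithmetic for the shift-then-check idea on a few sample characters.
SHIFT = ord("a") - ord("A")  # 97 - 65 == 32

results = {}
for character in ["a", "z", "A", "5", "{"]:
    ordinal = ord(character) - SHIFT
    # Only a lowercase letter lands in the uppercase range after shifting.
    was_lowercase = 65 <= ordinal <= 90
    results[character] = chr(ordinal) if was_lowercase else character
    print(repr(character), ord(character), "->", ordinal, was_lowercase)
```

Notice that the shifted value can be meaningless or even negative for other characters (a newline has ordinal 10, for example), which is fine as long as we only convert it back with chr() after the range check passes.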
Add Updated Characters to a New String
At this point, the bulk of the steps are complete. All that is left is to construct the new string. There are several ways to do this, but I’ll stick to the straightforward if/else:
def upper(string):
    result = ""
    for character in string:
        ordinal = ord(character) - 32
        if 65 <= ordinal <= 90:
            result += chr(ordinal)
        else:
            result += character
    return result
Now, this solution technically works. For instance, here’s what happens when we run our tests:
......
----------------------------------------------------------------------
Ran 6 tests in 0.012s

OK
However, there are a few quality of life updates we should probably make. For example, it’s generally bad practice to concatenate strings in a loop. Instead, let’s try converting our string to a list, so we can leverage the join() method:
def upper(string):
    characters = list(string)
    for index, character in enumerate(characters):
        ordinal = ord(character) - 32
        if 65 <= ordinal <= 90:
            characters[index] = chr(ordinal)
    return ''.join(characters)
Personally, I like this solution a bit more because it allows us to update the characters in place in a list, which is mutable, rather than rebuilding a string piece by piece. In addition, we got rid of a branch as well as concatenation in a loop.
That said, even after all this work, I think there’s another possible solution. Rather than iterating explicitly, what if we took advantage of one of the functional features of Python: map()? That way, we could apply our conversion logic in a more concise way:
def upper(string):
    return "".join(
        map(lambda c: chr(ord(c) - 32) if 97 <= ord(c) <= 122 else c, string)
    )
Granted, a lot of Python folks prefer list comprehensions. That said, both are fairly unreadable given our ordinal logic, so it’s probably for the best to stick to the previous solution. Otherwise, I think we’re done here!
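For the curious, here is roughly what the comprehension-style version might look like if we pull the ordinal logic into a helper first. This is only a sketch: the to_upper helper is a hypothetical name, and the check against the built-in covers ASCII characters only, in keeping with the simplified character sets used in this article.

```python
def to_upper(character):
    # Shift ASCII lowercase letters by 32; leave everything else alone.
    if 97 <= ord(character) <= 122:
        return chr(ord(character) - 32)
    return character


def upper(string):
    # Generator expression in place of map() and lambda.
    return "".join(to_upper(character) for character in string)


# Cross-check against the built-in on every ASCII character.
ascii_text = "".join(chr(code) for code in range(128))
matches_builtin = upper(ascii_text) == ascii_text.upper()
print(matches_builtin, upper("how now brown cow"))
```

Outside of ASCII the two will disagree (our version leaves a character like "ß" untouched), which is the expected cost of the simplification.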
Why Not Roll Your Own?
The purpose of these roll your own articles is threefold:
First, they allow me to take some time to practice my Python, and it’s fun trying to reverse engineer common Python functions and methods.
Second, they allow me to demonstrate the thought process of an experienced programmer to newer programmers.
Finally, they give me yet another way for folks in the community to contribute. If you’d like to share your own solution to this problem, head on over to Twitter and share your solution with #RenegadePython. Alternatively, I’m happy to check out your solutions in our Discord.
As always, I appreciate you taking the time to check out the site. If you’d like to help support The Renegade Coder, head on over to my list of ways to grow the site. Alternatively, feel free to check out some of these related articles:
- How to Convert an Integer to a String in Python: Type Casting and f-Strings
- How to Pick a Version of Python to Learn
Once again, thanks for checking out the site! I hope to see you again soon.