The Self-Taught Guide to Type Systems in Python


When it comes to learning Python, it’s really important that we come to grips with its type system. In this article, we’ll take a look at several type systems and determine which ones apply to Python. Then, we’ll finish out with an overview of some common data types.


Type Systems in Programming

When it comes to programming, one very important concept is typing. No, I’m not talking about literal typing on the keyboard, though that is an important aspect of development. Instead, I’m talking about data typing. In other words, the set of values a variable can take on.

In the real world, we’re comfortable with this idea of data types because it aligns nicely with our idea of categorization. For example, when I say the word “bird”, you probably imagine some sort of winged creature flying through the sky. In other words, we don’t have to imagine the same bird to come to a consensus on what a bird is. As it turns out, typing is a pretty similar concept: data has to fit into some sort of category.

Unfortunately, that’s sort of where the analogy fails as data in a computing system is really just a series of zeroes and ones. In order to categorize different patterns of bits, we have to introduce a typing system. That way, we can abstract patterns of bits a bit—pun absolutely intended.

As it turns out, there are a lot of ways to characterize a typing system. For example, some systems rely on the user to categorize their data explicitly (e.g. “I declare this pattern of bits a bird”) while other systems can infer categories (e.g. “This pattern of bits appears to be a bird”). In other words, explicit vs. implicit typing, respectively.

Similarly, some systems wait to verify categories until runtime (e.g. “Whoops! You tried to make a cat fly.”) while other systems check their categories before runtime (e.g. “Sorry, I won’t allow you to make a cat fly”). In other words, dynamic vs. static typing, respectively.

Finally, some systems allow data to be easily coerced into different categories (e.g. “This hat also makes a great bowl”) while other type systems are more strict (e.g. “This hat is definitely not a bowl”). In other words, weak vs. strong typing, respectively.

While these three pairs of type systems are not exhaustive, they form a nice foundation for our discussion around type systems. In the following subsections, we’ll break down each of these dichotomies.

Explicit vs. Implicit Typing

Perhaps the easiest dichotomy to explain is explicit vs. implicit typing. After all, these two systems have the largest visual impact on how code is written.

In an explicit typing system, data has to be labeled with its type. For example, if we want to store an integer, we have to label the variable with the appropriate type (the following is pseudocode):

integer x = 5

On the other hand, in an implicit typing system, data is not labeled. Instead, the compiler or interpreter will infer the data type from context (the following is pseudocode):

x = 5

In this case, it’s very clear that `x` stores an integer, so it’s no surprise that the type system can figure that out on our behalf.

In other examples, it can be less clear what type of value a variable holds. For instance, we might have some function that returns a value of some non-obvious type:

x = some_obscure_function()

To figure out what type of value `x` stores, we have to figure out what type of value our function returns. If that’s not clear, we have to keep digging through the code until we figure it out.
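In Python, one way to settle the question at runtime is the built-in `type()` function. Here’s a minimal sketch, where the body of `some_obscure_function` is made up purely for illustration:

```python
def some_obscure_function():
    # Hypothetical stand-in: nothing at the call site reveals the return type.
    return {"answer": 42}


x = some_obscure_function()

# type() reports the type of the value x currently refers to.
print(type(x))  # <class 'dict'>
```

Of course, this only helps once the code is running, so it doesn’t replace reading the function itself.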

In contrast, explicit typing systems don’t have this problem. However, they do tend to have issues of verbosity where types have to be written out all over the place (see: Java).

Today, most modern programming languages try to deal with these issues by having a mix of both systems. For example, Python is predominantly an implicitly typed language. After all, we can declare an integer variable just like above:

x = 5

However, Python includes a type hinting feature for folks who want to label their data a bit better:

x: int = 5

Unfortunately, type hinting didn’t come around until Python 3.5 (PEP 484). In fact, this exact syntax wasn’t supported until Python 3.6 (PEP 526). That said, for folks moving from an explicitly typed system like Java, this is probably a breath of fresh air.
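It’s worth knowing that type hints are labels, not rules: the interpreter records them but does not check them when the code runs. The sketch below illustrates this with a hypothetical `double` function (the name and body are my own, not from any library):

```python
def double(n: int) -> int:
    # The hints document intent; the interpreter does not enforce them.
    return n * 2


x: int = 5

print(double(x))     # 10
print(double("ha"))  # haha (no runtime error despite the int hint)

# The hints are stored on the function for tools to inspect.
print(double.__annotations__)  # {'n': <class 'int'>, 'return': <class 'int'>}
```

If you want hints to be checked, that job typically falls to a separate tool, such as a static type checker or an editor.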

Regardless, despite what you’ll hear in the forums, there’s very little consequence in choosing either system. In general, it comes down to style as most modern development tools will handle some form of type tracking for you.

Dynamic vs. Static Typing

If explicit and implicit typing systems describe the way data is labeled, then dynamic and static typing systems describe the way data is processed.

In a dynamic typing system, data is not processed until runtime. In other words, if we were to expand on our cat example from before, dynamic typing would allow us to attempt to make a cat fly. That doesn’t mean that what we’re doing is valid; it just means that we wouldn’t see any errors until we ran the code.

One simple pseudocode example involves trying to perform arithmetic with two values of different types:

5 + "Hello"

Normally, this would be invalid, right? After all, what would we even expect this to do? Unfortunately, in a dynamic typing system, we won’t find our error until we run the code:

TYPE_ERROR: CAN'T ADD 5 TO "HELLO"

On the flip side, in a static typing system, data is processed at compile time. In other words, if a TYPE_ERROR were to come up, the compiler would stop before we could execute our code.

Naturally, static typing contrasts quite a bit with dynamic typing because static typing forces the developer to address all type issues before a program can execute. As a result, sometimes it’s easier to get something up and running with dynamic typing.

Another interesting way to contrast the two type systems is to think about the range of possible values a variable can take on. For example, in a static typing system, variables have to stick to whatever type they were originally defined with. In other words, we would get a compilation error in the following code snippet:

x = 5
x = "Hi"
TYPE_ERROR: CAN'T CHANGE THE TYPE OF x

In a static typing system, we depend on `x` to hold its original type. Otherwise, the type becomes meaningless because we have no way of tracking `x` through the code without running it. As a result, any time we see `x`, we assume it holds whatever type it was originally assigned. If we attempt to reassign it a value of a different type, the compiler will reject the program with an error.

Meanwhile, in a dynamic typing system, we can redefine variables to our heart’s content. After all, since there’s no type checking at compile time, we can let a variable organically redefine itself over time. As long as it’s the appropriate type when we need it, we don’t care what it was up to. In other words, the code snippet above is totally valid.

With all that said, it’s probably a good time to mention that Python is a dynamically typed language—though it’s possible to develop a compiler for Python that could perform static typing. In other words, Python performs type checking at runtime, so variables can take on many forms throughout their life. For example, the following code snippet is totally legal:

x = 5
x = "Hi"
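To see what’s going on, we can ask for the type before and after the reassignment. In this small sketch, the type belongs to the value, and the name `x` simply points at whatever value it was given last:

```python
x = 5
first_type = type(x)
print(first_type)   # <class 'int'>

# Rebinding x to a string is allowed; no declaration is violated.
x = "Hi"
second_type = type(x)
print(second_type)  # <class 'str'>
```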

Unfortunately, this benefit comes at the cost of runtime type errors:

>>> 5 + "Hello"
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    5 + "Hello"
TypeError: unsupported operand type(s) for +: 'int' and 'str'
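One consequence worth seeing is that the error only shows up when the offending line actually runs. Here’s a minimal sketch using a hypothetical `describe` function with a bug hidden in one branch:

```python
def describe(count):
    if count > 0:
        return "count: " + str(count)
    # Bug: int + str. Nothing flags it until this line actually runs.
    return count + " items"


print(describe(3))  # count: 3

try:
    describe(0)
except TypeError as error:
    message = str(error)
    print(message)  # unsupported operand type(s) for +: 'int' and 'str'
```

The first call works fine, so a bug like this can sit unnoticed until some input finally reaches the bad branch.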

While nothing is stopping us from executing code with bad types, the interpreter will ultimately throw an error. After all, what would we expect the interpreter to do in this case? Funny you should ask: some languages actually support these kinds of operations. In the next section, we’ll look at some examples.

Weak vs. Strong Typing

One of the last ways we can divide type systems is by weak vs. strong. Unfortunately, of all the dichotomies, this is perhaps the least defined. In fact, I don’t believe there is a universal definition for either of these terms. That said, I’ll try to do my best to give them a working definition for this article.

Typically, a weak type system refers to the ability to allow types to be implicitly coerced into different types. As mentioned before, one way we can think about this is through the multiplicity of everyday objects. For example, I mentioned that a hat might also be used as a bowl, like the famous 10-gallon hat.

Of course, part of me thinks that this hat/bowl combo is a really silly example, yet I also think it serves the idea of weak type systems well. After all, in a weak type system, it’s possible for data to assume a form that it doesn’t really fit. This can lead to all kinds of nasty bugs which is why a lot of languages avoid extreme cases of weak type systems such as the ones in C and PHP.

That said, in a weak type system, data can be naturally coerced into other values. For example, if we tried to add text and an integer like before, we could expect one of those variables to take the form of the other—which form depends on how the language rules are implemented. In other words, it’s possible the following happens (in pseudocode):

>>> 5 + "7"
"57"

In this case, 5 is naturally converted into text where it is then added to “7”. On the other hand, we might see “7” converted into an integer and added to 5 (in pseudocode):

>>> 5 + "7"
12

On the other end of the spectrum, we have the strong type system which doesn’t allow a type to be implicitly coerced into another type. Languages that adopt this type of system usually throw errors when types are mixed. For example, adding text to a number will result in the same TYPE_ERROR from above (in pseudocode):

5 + "7"
TYPE_ERROR: CANNOT ADD 5 to "7"

Unfortunately, because these definitions are so ambiguous, it’s hard to really categorize a type system as strong or weak. For example, Java allows for almost anything to be “added” to text by automatically converting that thing to text as well. Does that make Java a weakly typed language? I don’t think so.

Likewise, I would definitely consider Python a strongly typed language based on the example we’ve discussed already. After all, in order to combine a number with some text in Python, one of the values has to be explicitly converted—no implicit coercion.
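As a quick sketch of what that explicit conversion looks like, the built-in `str()` and `int()` functions let us pick which of the two earlier pseudocode results we want (the variable names here are just for illustration):

```python
number = 5
text = "7"

# Convert the number to text, then concatenate.
as_text = str(number) + text
print(as_text)  # 57

# Or convert the text to a number, then add.
as_number = number + int(text)
print(as_number)  # 12
```

Either way, the choice is ours rather than the language’s.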

However, there are places where Python is a bit more flexible. For example, some values are treated as `False` in certain contexts, such as the condition of an `if` statement. These values are called falsy, and they include values like `0`, `""`, `[]`, and more. Naturally, all other values are considered truthy, meaning they are treated as `True` in those same contexts.

That said, most arguments I’ve seen state that Python is strongly typed. After all, just because some values are interpreted as true/false doesn’t mean those values change type in the process.
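To make that concrete, here’s a small sketch that runs a few falsy values through `bool()` and then checks the type of a list after it has been used in a condition:

```python
falsy_values = [0, "", [], None]

for value in falsy_values:
    # bool() shows how the value is treated in a condition.
    print(repr(value), bool(value), type(value).__name__)

empty = []
if not empty:
    print("empty list is falsy")

# The list is still a list; the truth test did not convert it.
print(type(empty))  # <class 'list'>
```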

Overall, I’d say not to worry too much about this designation as it doesn’t offer a ton of value in the discussion around types. That said, in the next section, we’ll do a quick recap of Python’s type system before discussing what this means going forward.

Python’s Type System

Now that we’ve had a chance to discuss type systems a bit, let’s revisit Python’s type system. Specifically, Python falls into the following three typing distinctions:

  • Implicit
  • Dynamic
  • Strong

In other words, Python types do not have to be labeled, are only evaluated at runtime, and cannot be implicitly coerced.

As a result, we end up with a language that has concise code because types are inferred. To its detriment, however, this can make it harder to track types in the code.

Likewise, we end up with a language that lets variables be a bit more fluid: taking on different forms at different times. Unfortunately, this can also make it harder to track types in the code.

As a consequence, critics of Python argue that it lends itself to smaller projects. In other words, as a project grows, it becomes harder and harder to maintain the code.

Of course, if you’re a beginner, it can be hard to judge that criticism. After all, we haven’t really seen a lot of code, and the examples we have seen lack the complexity to get the point across. So, we’ll take the rest of this article to look at some of the common data types in Python.

Common Python Data Types

Before we dig in, I want to mention that the purpose of this section is to give you a quick overview of the types of data that you can expect to see in a Python program. Since we haven’t had the chance to write a lot of code yet, some of these data types aren’t going to make a lot of sense. That’s okay! We’ll have plenty of time to talk about the different data types in more detail.

With that said, let’s kick things off with some numbers.

Integers

One of the data types that we’ve already been exposed to in this series is the integer. To recap, an integer is any whole number or its negative equivalent (e.g. -2, -1, 0, 1, 2). We can represent these types of values directly in Python:

>>> 5
5

One interesting feature of integers in Python is that they’re unbounded. In other words, there’s no fixed limit on the size of an integer beyond the memory available on your machine. If you’re familiar with other languages, this might come as a shock. After all, it’s common for integers to be represented in one of two forms: 32-bit or 64-bit. As a result, they typically have an upper and lower bound to how large they can be.
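As a quick illustration, here’s a value far too large for a 64-bit integer that Python handles without any special effort:

```python
big = 2 ** 100
print(big)  # 1267650600228229401496703205376

# Still an ordinary int, and arithmetic on it stays exact.
print(type(big))      # <class 'int'>
print(big + 1 - big)  # 1
```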

Another interesting feature of integers in Python is that they can be combined in all sorts of mathematical expressions. For example, it’s possible to add two integers together using the addition operator (`+`). Likewise, it’s possible to subtract, multiply, and divide integers as well:

>>> 2 + 3
5
>>> 7 - 1
6
>>> 8 * 4
32
>>> 9 / 3
3.0
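Division deserves a second look. In Python 3, the `/` operator always produces a float, even when both operands are integers and the result is a whole number. If we want an integer back, there’s the floor division operator (`//`). A short sketch:

```python
quotient = 9 / 3
print(quotient)        # 3.0
print(type(quotient))  # <class 'float'>

# Floor division keeps the result an int, discarding any remainder.
floored = 9 // 3
print(floored)         # 3
print(7 // 2)          # 3
```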

In the next article, we’ll take a much deeper look at these mathematical expressions as well as other operators. Likewise, we’ll also talk about the related type, float, which can be used to represent decimal values. For now, let’s move on to another data type we’ve seen a lot in this series.

Strings

Another common data type in Python is the string, which is used to represent text. For example, when we printed “Hello, World” to the user in the past, we used a string:

>>> "Hello, World"
'Hello, World'

Of course, Python is a bit weird in that it lets us define a string using either single or double quotes:

>>> 'Hello, World'
'Hello, World'

Honestly, I don’t really have a recommendation on which set of quotes to use. As someone who came from a Java background, I’m a bit partial to double quotes. That said, there don’t seem to be any hard and fast rules around them.
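If it helps, here’s a small sketch showing that both quote styles produce the same string, along with the one practical reason I can think of to pick one over the other: avoiding escapes when the text itself contains a quote.

```python
single = 'Hello, World'
double = "Hello, World"
print(single == double)  # True

# Picking the other quote style avoids escaping.
print("It's fine")      # It's fine
print('She said "hi"')  # She said "hi"
```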

At any rate, strings are definitely one of the most versatile data types, so we’ll probably find ourselves using them a bit in this series.

Lists

The last data type I want to talk about today is the list. Typically, computer science education tends to avoid talking about lists (or rather, arrays) for as long as possible. I think part of that is the complexity of the data structure, but I also think students tend to force one into every solution as soon as they learn about it.

That said, I’ll go against my better judgement to introduce one last common data type: the list. As the name implies, a list is a collection of items like a shopping list. In Python, they can be created as follows:

x = []

Of course, if we want the list to store anything, we have to populate it:

x = ["cheese", "egg", "milk", "bread"]

Naturally, once we have the list, we can do a ton of fun things with it like searching and sorting. Of course, for our purposes right now, we’ll just stick to creating them.
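For the curious, here’s a small preview of what adding, searching, and sorting can look like using the shopping list from above. Treat it as a sketch rather than a full tour; the extra item is made up for illustration:

```python
x = ["cheese", "egg", "milk", "bread"]

# Add an item to the end of the list.
x.append("butter")

# Search: the in operator checks for membership.
has_milk = "milk" in x
print(has_milk)  # True

# Sort: sorted() returns a new list and leaves x untouched.
ordered = sorted(x)
print(ordered)  # ['bread', 'butter', 'cheese', 'egg', 'milk']
print(x[0])     # cheese
```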

In the meantime, I recommend taking some time to explore these data types. As we start to write our own code, we’ll likely encounter these throughout. For now, let’s go ahead and wrap things up!

Follow Your Types

Now that we’ve had a chance to talk about Python’s type system and look at a few of the data types in action, I want to leave you with one piece of advice: follow your types.

When it comes to development, the most important thing you can do is make sure that your data is in the form you expect it to be. This was excellent advice I was given when I was learning Java, and Java has a type checker built right into the compiler. I think this advice is even more important for folks trying to learn Python.
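One way to put this advice into practice is to check a value’s type at the point where it matters. The sketch below uses a hypothetical `total_price` function; whether an explicit check like this is worth it depends on the situation, so take it as one option rather than a rule:

```python
def total_price(prices):
    # Check the shape of the data before relying on it.
    if not isinstance(prices, list):
        raise TypeError(f"expected a list, got {type(prices).__name__}")
    return sum(prices)


print(total_price([1.5, 2.5]))  # 4.0

try:
    total_price("1.50")
except TypeError as error:
    message = str(error)
    print(message)  # expected a list, got str
```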

As we continue in this series, keep this mantra in the back of your mind. It’ll really help you track down and prevent bugs.

In the meantime, I’d appreciate it if you took some time to show this series some love. Feel free to give this article a share. Even better, head on over to my list of ways to grow the site and find something that works for you. I recommend the newsletter. It’s pretty low commitment, and you’ll always have something new to read at the beginning of each month.

Otherwise, I appreciate your time, and I hope I’ll see you next time!

The Autodidact's Guide to Python (11 Articles)—Series Navigation

One of my friends has decided to start learning Python, so I decided to create a series for him. This series is a dump of everything I’ve learned about Python since 2017. As someone who taught myself Python, I figured this would appeal to folks like me.

Jeremy Grifski

Jeremy grew up in a small town where he enjoyed playing soccer and video games, practicing taekwondo, and trading Pokémon cards. Once out of the nest, he pursued a Bachelors in Computer Engineering with a minor in Game Design. After college, he spent about two years writing software for a major engineering company. Then, he earned a master's in Computer Science and Engineering. Today, he pursues a PhD in Engineering Education in order to ultimately land a teaching gig. In his spare time, Jeremy enjoys spending time with his wife, playing Overwatch and Phantasy Star Online 2, practicing trombone, watching Penguins hockey, and traveling the world.
