How to Check if a String Contains a Substring in Python: In, Index, and More

One concept that threw me for a loop when I first picked up Python was checking if a string contains a substring. After all, in my first language, Java, the task involved calling a method like `indexOf()` or `contains()`. Luckily, Python has an even cleaner syntax, and we’ll cover that today.

To summarize, we can check if a string contains a substring using the `in` keyword. For example, `"Hi" in "Hi, John"` returns `True`. That said, there are several other ways to solve this problem, including methods like `index()` and `find()`. Check out the rest of the article for more details.

Problem Description

A common problem in programming is detecting if a string is a substring of another string. For example, we might have a list of addresses stored as strings (that we might even sort), and we want to find all addresses on a certain street (e.g. Elm Street):

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

In that case, we might check which addresses contain the street name (e.g. 123 Elm Street). How do we do something like this in Python?

In most programming languages, there’s usually some substring method. For instance, in Java, strings have an `indexOf()` method which returns the index of the first match (zero or greater) if the substring was found and -1 otherwise.

Even without a special method, most languages allow you to index strings like arrays—just be careful of the IndexErrors as usual. As a result, it’s possible to manually verify that a string contains a substring by looking for a match directly.

In the following section, we’ll take a look at several possible solutions in Python.

Solutions

As always, I like to share a few possible solutions to this problem. That said, if you want the best solution, I suggest jumping to the last solution.

Checking if String Contains Substring by Brute Force

Whenever I try to solve a problem like this, I like to think about the underlying structure of the problem. In this case, we have a string, which is really a sequence of characters. As a result, nothing is stopping us from iterating over those characters to find our substring:

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

for address in addresses:
    address_length = len(address)
    street_length = len(street)
    for index in range(address_length - street_length + 1):
        substring = address[index:street_length + index]
        if substring == street:
            print(address)

Here, I’ve written a somewhat nasty set of loops which iterate over all addresses, compute the lengths of both strings, iterate over all substrings of the appropriate size, and print the address if a matching substring is found.
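To make the sliding-window idea a bit easier to reuse, the inner loop can be pulled into a function. The following is only a sketch: the name `contains` is my own for illustration, and it returns as soon as the first match is found instead of printing.

```python
def contains(text, target):
    """Return True if target occurs anywhere in text (brute-force scan)."""
    target_length = len(target)
    # Slide a window of len(target) characters across text.
    for start in range(len(text) - target_length + 1):
        if text[start:start + target_length] == target:
            return True  # Stop at the first match
    return False


addresses = ["123 Elm Street", "531 Oak Street", "678 Maple Street"]
matches = [address for address in addresses if contains(address, "Elm Street")]
print(matches)  # ['123 Elm Street']
```

If the target is longer than the text, the range is empty and the function falls through to `False`.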

Luckily, we don’t have to write our own solution to this. In fact, the entire inner loop is already implemented as a part of strings. In the next section, we’ll look at one of those methods.

Checking if String Contains Substring Using `index()`

If we want to check if a string contains a substring in Python, we might try borrowing some code from a language like Java. As mentioned previously, Java programmers usually reach for the `indexOf()` method, which returns the index of the substring. In Python, there’s a similar method called `index()`:

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

for address in addresses:
    try:
        address.index(street)
        print(address)
    except ValueError:
        pass

Here, we call the `index()` method without storing the result. After all, we don’t actually care what the index is. If the method doesn’t find a matching substring, it’ll raise a `ValueError`. Naturally, we can catch that exception and move on. Otherwise, we print out the address.
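To see both behaviors in isolation, here is a small sketch using one of the sample addresses. The `found` flag is just an illustrative variable of mine.

```python
address = "123 Elm Street"

# A successful search returns the position of the first match.
print(address.index("Elm Street"))  # 4
print(address.index("123"))  # 0

# A failed search raises ValueError instead of returning a sentinel value.
try:
    address.index("Oak Street")
    found = True
except ValueError:
    found = False

print(found)  # False
```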

While this solution gets the job done, there’s actually a slightly cleaner solution, and we’ll take a look at it in the next section.

Checking if String Contains Substring Using `find()`

Interestingly enough, Python has another method similar to `index()` which functions almost identically to `indexOf()` from Java. It’s called `find()`, and it allows us to simplify our code a little bit:

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

for address in addresses:
    if address.find(street) >= 0:
        print(address)

Now, that’s a solution I can get behind. After all, it’s quite reminiscent of a similar Java solution.

Again, it works like `index()`. However, instead of raising an exception if the substring doesn’t exist, it returns -1. As a result, we can reduce our try/except block to a single if statement.
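One detail worth spelling out is why the comparison with 0 matters. A match at the very start of the string returns 0, which is falsy, while the "not found" value of -1 is truthy. The sketch below uses illustrative variable names of my own to show the difference:

```python
address = "123 Elm Street"

print(address.find("Elm Street"))  # 4
print(address.find("Oak Street"))  # -1
print(address.find("123"))  # 0

# Pitfall: testing the result directly gets the answer backwards.
buggy_match = bool(address.find("123"))  # False, although "123" is present
buggy_miss = bool(address.find("Oak Street"))  # True, although it is absent

# Comparing against 0 gives the intended result.
correct_match = address.find("123") >= 0  # True
correct_miss = address.find("Oak Street") >= 0  # False
```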

That said, Python has an even better solution which we’ll check out in the next section.

Checking if String Contains Substring Using the `in` Keyword

One of the cool things about Python is how clean and readable the code can be. Naturally, this applies when checking if a string contains a substring. Instead of a fancy method, Python has the syntax built in with the `in` keyword:

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

for address in addresses:
    if street in address:
        print(address)

Here, we use the `in` keyword twice: once to iterate over all the addresses in the address list and again to check if the address contains the street name. As you can see, the `in` keyword has two purposes:

  • To check if a value is present in a sequence like lists and strings
  • To iterate through a sequence

Of course, to someone coming from a language like Java, this can be a pretty annoying answer. After all, our intuition is to use a method here, so it takes some getting used to. That said, I really like how this reads. As we’ll see later, this is also the fastest solution.
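The membership test listed above behaves a little differently depending on the container. As a quick sketch with the same sample data, `in` looks for a substring when the right-hand side is a string, but it looks for an equal element when the right-hand side is a list:

```python
addresses = ["123 Elm Street", "531 Oak Street", "678 Maple Street"]
street = "Elm Street"

# On a string, `in` tests for a substring.
print(street in addresses[0])  # True

# On a list, `in` tests for an element equal to the value, so a partial match fails.
print(street in addresses)  # False
print("123 Elm Street" in addresses)  # True

# Both uses of `in` combine naturally in a list comprehension.
matches = [address for address in addresses if street in address]
print(matches)  # ['123 Elm Street']
```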

Performance

With all these solutions ready to go, let’s take a look at how they compare. To start, we’ll need to set the solutions up in strings:

setup = """
addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"
"""

brute_force = """
for address in addresses:
    address_length = len(address)
    street_length = len(street)
    for index in range(address_length - street_length + 1):
        substring = address[index:street_length + index]
        if substring == street:
            pass # I don't want to print during testing
"""

index_of = """
for address in addresses:
    try:
        address.index(street)
        # Again, I don't actually want to print during testing
    except ValueError:
        pass
"""

find = """
for address in addresses:
    if address.find(street) >= 0:
        pass # Likewise, nothing to see here
"""

in_keyword = """
for address in addresses:
    if street in address:
        pass # Same issue as above
"""

With these strings ready to go, we can begin testing:

>>> import timeit
>>> min(timeit.repeat(setup=setup, stmt=brute_force))
4.427290499999998
>>> min(timeit.repeat(setup=setup, stmt=index_of))
1.293616
>>> min(timeit.repeat(setup=setup, stmt=find))
0.693925500000006
>>> min(timeit.repeat(setup=setup, stmt=in_keyword))
0.2180926999999997

Now, those are some convincing results! As it turns out, brute force is quite slow. In addition, it looks like the error handling of the `index()` solution isn’t much better. Luckily, `find()` exists to eliminate some of that overhead. That said, `in` is the fastest solution by far.

As is often the case in Python, you’ll get the best performance out of common idioms. In this case, don’t try to write your own substring method. Instead, use the built-in `in` keyword.
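When reading the numbers above, keep in mind that `timeit.repeat()` runs each statement one million times per trial by default, so each figure is the total time in seconds for a million executions on my machine. If you want to rerun the comparison more quickly, here is a rough sketch that sets `repeat` and `number` explicitly. The values I chose are arbitrary, and your timings will differ.

```python
import timeit

setup = """
addresses = ["123 Elm Street", "531 Oak Street", "678 Maple Street"]
street = "Elm Street"
"""

in_keyword = """
for address in addresses:
    if street in address:
        pass
"""

number = 100_000  # Executions per trial (arbitrary, smaller than the default)
timings = timeit.repeat(stmt=in_keyword, setup=setup, repeat=3, number=number)

# The minimum is the least noisy estimate, as in the session above.
best = min(timings)
print(f"{best:.4f} s for {number} runs ({best / number * 1e9:.0f} ns per run)")
```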

Challenge

Now that you know how to check if a string contains a substring, let’s talk about the challenge. We’re going to write a simple address search engine which filters on two keywords rather than one: street and number. However, we may not get both pieces of information at the time of search. As a result, we need to deal with finding addresses which exactly match whatever keywords are available.

For this challenge, you can write any solution you want as long as it prints out a list of addresses that exactly matches the search terms. For instance, take the following list of addresses:

addresses = [
    "123 Elm Street",
    "123 Oak Street",
    "678 Elm Street"
]

If a user searches just “Elm Street”, then I would expect the solution to return “123 Elm Street” and “678 Elm Street”. Likewise, if a user searches “123”, then I would expect the solution to return “123 Elm Street” and “123 Oak Street”. However, if the user provides both “123” and “Elm Street”, I would expect the solution to only return “123 Elm Street”—not all three addresses.

Here’s how I might expect the program to work:

search(addresses, "123", None)  # Returns "123 Elm Street" and "123 Oak Street"
search(addresses, "123", "Elm Street")  # Returns "123 Elm Street"
search(addresses, None, "Elm Street")  # Returns "123 Elm Street" and "678 Elm Street"

Feel free to have fun with this. For example, you could choose to write an entire front end for collecting the street and number keywords, or you could assume both of those variables already exist.

In terms of input data, feel free to write your own list of addresses or use my simple example. Alternatively, you can use a website which generates random addresses.

Ultimately, the program needs to demonstrate filtering on two keywords. In other words, find a way to modify one of the solutions from this article to match the street, number, or both, depending on what is available at the time of execution.
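If you’d like a starting point, here is one possible sketch built on the `in` keyword. It is not the only answer, and it assumes plain substring matching is good enough, so a number like "123" would also match an address such as "1234 Elm Street". It also returns a list rather than printing, which is my own choice.

```python
def search(addresses, number=None, street=None):
    """Return the addresses that contain every search term that was provided."""
    # Skip any term that is None so a missing keyword doesn't filter anything out.
    terms = [term for term in (number, street) if term is not None]
    return [
        address
        for address in addresses
        if all(term in address for term in terms)
    ]


addresses = [
    "123 Elm Street",
    "123 Oak Street",
    "678 Elm Street"
]

print(search(addresses, "123", None))  # ['123 Elm Street', '123 Oak Street']
print(search(addresses, "123", "Elm Street"))  # ['123 Elm Street']
print(search(addresses, None, "Elm Street"))  # ['123 Elm Street', '678 Elm Street']
```

Note that when both terms are `None`, this version returns every address. Whether that is the right behavior is up to you.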

When you have your solution, head on over to Twitter and give it a share using the hashtag #RenegadePython.

If I see your solution, I’ll give it a share!

A Little Recap

And with that, we’re finished. As a final recap, here are all the solutions you saw today:

addresses = [
    "123 Elm Street",
    "531 Oak Street",
    "678 Maple Street"
]
street = "Elm Street"

# Brute force (don't do this)
for address in addresses:
    address_length = len(address)
    street_length = len(street)
    for index in range(address_length - street_length + 1):
        substring = address[index:street_length + index]
        if substring == street:
            print(address)

# The index method
for address in addresses:
    try:
        address.index(street)
        print(address)
    except ValueError:
        pass

# The find method
for address in addresses:
    if address.find(street) >= 0:
        print(address)

# The in keyword (fastest/preferred)
for address in addresses:
    if street in address:
        print(address)

As always, if you liked this article, make sure to give it a share. If you’d like more articles like this to hit your inbox, hop on my mailing list. While you’re at it, consider joining me on Patreon.

Otherwise, that’s all I have. Thanks again for your support!

How to Python (42 Articles)—Series Navigation

The How to Python tutorial series strays from the usual in-depth coding articles by exploring byte-sized problems in Python. In this series, students will dive into unique topics such as How to Invert a Dictionary, How to Sum Elements of Two Lists, and How to Check if a File Exists.

Each problem is explored from the naive approach to the ideal solution. Occasionally, there’ll be some just-for-fun solutions too. At the end of every article, you’ll find a recap full of code snippets for your own use. Don’t be afraid to take what you need!

If you’re not sure where to start, I recommend checking out our list of Python Code Snippets for Everyday Problems. In addition, you can find some of the snippets in a Jupyter notebook format on GitHub.

If you have a problem of your own, feel free to ask. Someone else probably has the same problem. Enjoy How to Python!

Jeremy Grifski

Jeremy grew up in a small town where he enjoyed playing soccer and video games, practicing taekwondo, and trading Pokémon cards. Once out of the nest, he pursued a Bachelors in Computer Engineering with a minor in Game Design. After college, he spent about two years writing software for a major engineering company. Then, he earned a master's in Computer Science and Engineering. Today, he pursues a PhD in Engineering Education in order to ultimately land a teaching gig. In his spare time, Jeremy enjoys spending time with his wife, playing Overwatch and Phantasy Star Online 2, practicing trombone, watching Penguins hockey, and traveling the world.