How to Open a File in Python: open(), pathlib, and More

At long last, I’ve decided to finally get over my fear of Input/Output long enough to write another article about files. In particular, we’re going to take a look at the process behind opening a file in Python.

For those of you short on time, the quickest way to open a file in Python is to take advantage of the open() function. Specifically, all we need to do is pass a path to the function: open('/path/to/file'). Alternatively, we can take advantage of the pathlib module, which lets us represent paths as Path objects and open them with the open() method.

If that’s not enough to get you started, keep reading! Otherwise, I’d appreciate it if you took a moment to check out the list of ways to help grow the site. Thanks again for the support!

Problem Description

As this series grows, I find myself constantly pushed into uncomfortable domains. For example, a lot of people use Python for data science, so I feel some pressure to write about libraries like Pandas and Numpy. Likewise, one topic that comes up a lot is Input/Output—specifically working with files.

Now, I’ve sort of avoided talking about files in this series because files are complex. They can come in many, many different shapes and sizes, and they’re never consistent across platforms.

To add insult to injury, Python has expanded its file support over time. As a result, you really have to take care when listing solutions because they almost certainly won’t work in all versions of Python. In fact, I saw this issue in my file existence article from way back.

That said, today I’ve decided to wade back out into the dark territory that is IO. Specifically, we’re going to talk about how to open a file in Python. Basically, that means we’re going to look at some different ways to access a file for reading and writing.

Fortunately, Python is quite a bit less painful to work with than languages like Java or C. In other words, we should find IO to be a piece of cake (with lots of caveats along the way).

Solutions

If you’ve been around this series for any amount of time, you know that I like to pool together a whole series of solutions. Of course, each list comes with the caveat that not all solutions are applicable in every scenario. For example, the first solution in this list should almost never be used, but I included it for the sake of tradition.

With that said, let’s go ahead and take a look at a few ways to open a file in Python.

Open a File with Shell Commands

With Python being a high-level language, there are tons of utilities built directly into the language for opening files. Of course, if you know me, I always like to take my first swipe at the challenge the hard way. In other words, I wanted to see if there was a way to open a file without using any straightforward functions.

Naturally, the first thing I thought about was shell commands. In other words, what if there was some way to interact with the command line directly? That way, I could just run Windows or Linux commands to open a file.

Unsurprisingly, Python has an interface for this. All we have to do is import the os library and run the commands directly:

import os
os.system('type NUL > out.txt')  # Windows only

Here, we create an empty file called “out.txt” in the current working directory. Unfortunately, this doesn’t really open a file in the sense that we don’t have a file reference to play with—though I’m sure we could read a file using this same syntax.

That said, this solution gives us a lot of flexibility, and if we want even more flexibility, we can rely on the subprocess module. However, I have no desire to go down that rabbit hole when there are so many better solutions to follow.

Open a File with the Open Function

If you’re like me, and your first language was Java, you know how painful it can be to open a file. Luckily, Python has a built-in function to make opening a file easy:

open('/path/to/file')

Of course, it’s a bit more clunky to use because it can throw an exception. For example, if the file doesn’t exist, the code will crash with the following error:

>>> open('/path/to/file')
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    open('/path/to/file')
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/file'

As a result, a call to open() is usually wrapped in a try/except:

try:
  open('/path/to/file')
except FileNotFoundError:
  pass

That way, if the error does arise, we have a mechanism for dealing with it.
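As a minimal sketch of what that mechanism might look like, the handler below reports the problem and hands back None rather than silently passing. The helper name try_open is made up for illustration, and the path is assumed not to exist:

```python
def try_open(path):
    """Return an open file object, or None if the file does not exist."""
    try:
        return open(path)
    except FileNotFoundError as error:
        # Report the problem instead of silently ignoring it
        print(f"Could not open {path}: {error.strerror}")
        return None


# Hypothetical path that is assumed not to exist
my_file = try_open("/path/to/file")
if my_file is not None:
    my_file.close()
```

Returning None is just one choice here. Depending on the program, it might make more sense to fall back to a default file or to re-raise the error with a friendlier message.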

As an added wrinkle, opening a file introduces a resource to our program. As a result, it’s also good practice to close the file when we’re done with it:

try:
  my_file = open('/path/to/file')
  my_file.close()
except FileNotFoundError:
  pass

Or, if we’re clever, we can take advantage of the with statement:

try:
  with open('/path/to/file') as my_file:
    pass
except FileNotFoundError:
  pass

This cleans up the code quite a bit! Now, we don’t have to explicitly close the file.
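To see that the with statement really does close the file for us, here is a small sketch that checks the file object's closed attribute inside and outside the block. The temporary file is only there so the example has something to open:

```python
import os
import tempfile

# Create a throwaway file so the example has something to open
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path) as my_file:
    inside = my_file.closed   # False: the file is open inside the block

outside = my_file.closed      # True: the with statement closed it for us
print(inside, outside)

os.remove(path)  # clean up the throwaway file
```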

The only thing left to mention is our options. After all, it’s not enough just to open the file. We need to specify some parameters. For example, are we going to open the file just for reading? Then, we should probably open in reading mode:

try:
  with open('/path/to/file', 'r') as my_file:
    pass
except FileNotFoundError:
  pass

Alternatively, if we want to read and write to the file, we can use “r+”:

try:
  with open('/path/to/file', 'r+') as my_file:
    pass
except FileNotFoundError:
  pass

For those that are interested, here’s a (mostly) complete table of modes:

Mode | Description
--- | ---
r | Opens an existing file as text for reading only
w | Opens a new file or overwrites an existing file as text for writing only
a | Opens a new file or an existing file as text for writing, where new text is added to the end of the file (i.e. append)
r+ | Opens an existing file as text for reading and writing
w+ | Opens a new file or overwrites an existing file as text for reading and writing
a+ | Opens a new file or an existing file as text for reading and writing, where new text is added to the end of the file (i.e. append)
rb | Opens an existing file as binary for reading only
wb | Opens a new file or overwrites an existing file as binary for writing only
ab | Opens a new file or an existing file as binary for writing, where new bytes are added to the end of the file (i.e. append)
rb+ | Opens an existing file as binary for reading and writing
wb+ | Opens a new file or overwrites an existing file as binary for reading and writing
ab+ | Opens a new file or an existing file as binary for reading and writing, where new bytes are added to the end of the file (i.e. append)
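The practical difference between the write and append modes is easiest to see by running them back to back. Here is a small sketch that uses a throwaway file in a temporary directory (the file name modes.txt is arbitrary):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w") as f:   # creates the file
    f.write("first\n")

with open(path, "a") as f:   # keeps "first" and adds to the end
    f.write("second\n")

with open(path, "r") as f:
    after_append = f.read()

with open(path, "w") as f:   # truncates: both earlier lines are gone
    f.write("third\n")

with open(path, "r") as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```

In other words, "a" preserves whatever is already in the file, while "w" starts from an empty file every time.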

In addition, there are a handful of other modes that you can read more about in the documentation. That said, keep in mind that a lot of the concepts mentioned here are still useful in the following solutions.
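One of those other modes is "x", which creates a file but refuses to touch one that already exists. As a quick sketch (again with a throwaway file whose name is arbitrary):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "exclusive.txt")

with open(path, "x") as f:       # succeeds: the file does not exist yet
    f.write("created\n")

try:
    open(path, "x")              # fails: the file already exists
    second_attempt = "opened"
except FileExistsError:
    second_attempt = "refused"

print(second_attempt)
```

This can be handy when overwriting an existing file by accident would be a problem.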

Open a File with the pathlib Module

While the open() function is handy, there is another option that’s a bit more robust: the pathlib module. Basically, this module allows us to think of files at a higher level by wrapping them in a Path object:

from pathlib import Path
my_file = Path('/path/to/file')

Then, opening the file is as easy as using the open() method:

my_file.open()

That said, many of the same issues still apply. For example, running the code above will result in the following error:

>>> my_file = Path('/path/to/file')
>>> my_file.open()
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    my_file.open()
  File "C:\Users\Jeremy Grifski\AppData\Local\Programs\Python\Python38-32\lib\pathlib.py", line 1213, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
  File "C:\Users\Jeremy Grifski\AppData\Local\Programs\Python\Python38-32\lib\pathlib.py", line 1069, in _opener
    return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: '\\path\\to\\file'

Look familiar? It should! After all, we ran into this error when we tried to open this imaginary file before. In other words, all the same rules apply. For example, a mode can be passed along as needed:

my_file.open('a')
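Since Path.open() returns the same kind of file object as the built-in open(), it also works with the with statement. Here is a minimal sketch that appends to a throwaway file twice and reads the result back (the file name log.txt is just for illustration):

```python
import tempfile
from pathlib import Path

my_file = Path(tempfile.mkdtemp()) / "log.txt"

# Append mode creates the file on the first pass
with my_file.open("a") as f:
    f.write("hello\n")

# On the second pass, the new text lands after the old text
with my_file.open("a") as f:
    f.write("world\n")

contents = my_file.read_text()
print(contents)
```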

That said, pathlib is nice because it provides a lot of helpful methods. For instance, instead of using a try/except, we can use one of the helpful boolean methods:

if my_file.exists():
  my_file.open('a')

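Of course, there’s a bit of a catch here. If for some reason the file is deleted after we check that it exists but before we open it, we can still run into trouble: in a read mode the open will fail with an error, and in append mode a brand new file will quietly be created. As a result, it’s usually a safer bet to use the try/except strategy from before.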
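For comparison, here is a sketch of the try/except strategy applied to a Path object. Rather than checking first, it just attempts the open and handles the failure. The helper name read_if_present and the file names are hypothetical:

```python
import tempfile
from pathlib import Path


def read_if_present(path):
    """Return the file's text, or None if it vanished or never existed."""
    try:
        with Path(path).open("r") as f:
            return f.read()
    except FileNotFoundError:
        return None


# Throwaway directory so the demo doesn't touch real files
folder = Path(tempfile.mkdtemp())
present = folder / "present.txt"
present.write_text("still here\n")

found = read_if_present(present)                 # the file's contents
missing = read_if_present(folder / "gone.txt")   # None
print(repr(found), repr(missing))
```

Because there is no gap between a check and the open, there is no window for the file to disappear in between.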

Overall, I’m a big fan of this solution—especially when I want to do more than read the file. For instance, here’s a table of helpful methods that can be executed on these Path objects:

Method | Description
--- | ---
chmod() | Changes the file mode and permissions
is_file() | Returns True if the path is a file
mkdir() | Creates a directory at the given path
rename() | Renames the file/directory at the given path
touch() | Creates a file at the given path
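To give a feel for a few of these, here is a short sketch that works inside a temporary directory. The names notes, draft.txt, and final.txt are made up for the example:

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())

folder = base / "notes"
folder.mkdir()                     # create a directory

draft = folder / "draft.txt"
draft.touch()                      # create an empty file
was_file = draft.is_file()         # True: the file now exists

final = folder / "final.txt"
draft.rename(final)                # move draft.txt to final.txt

print(was_file, draft.exists(), final.is_file())
```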

Of course, if you’re interested in browsing the entire suite of methods, check out the documentation. In the meantime, we’re going to move on to performance.

Performance

In my experience, IO is a bit of a pain to test because we usually need to run our tests for at least two scenarios: the file either exists or it doesn’t. In other words, for every possible test we come up with, we have to test it once for an existing file and again for a nonexistent file.

Now, to make matters worse, we also have a ton of modes to explore. Since I didn’t purposefully limit the scope of this article, that means we have a lot to test. For simplicity, I’m going to only test two modes: read and write. I have no idea if there will be a performance difference here, but I’m interested in exploring it.

With those caveats out of the way, let me remind everyone that I use timeit for all my performance tests. For these tests, we’ll need to create strings of all the different tests we’d like to try. Then, it’s just a matter of running them. If you’re interested in learning more about this process, I have an article about performance testing just for you. Otherwise, here are the strings:

setup = """
import os
from pathlib import Path
"""

system_commands = """
os.system('type NUL > out.txt')
"""

open_r = """
open("out.txt", "r")  # Existing file
"""

open_w = """
open("out.txt", "w")  # Existing file
"""

path_r = """
Path("out.txt").open("r")  # Existing file
"""

path_w = """
Path("out.txt").open("w")  # Existing file
"""

As we can see, none of these solutions are written with a nonexistent file in mind. I realized that those would be a bit more difficult to test because we would have to delete the file between executions (at least for the write solutions). As a result, I chose to leave them out. Feel free to test them yourself and let me know what you find.
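For anyone who wants to try the nonexistent file case, one possible approach is sketched below. Since timeit only runs the setup once per timing run, the idea is to execute the statement a single time per run (number=1) and let the setup delete the file before each one. The function names here are my own, and timings of single calls are noisy, so treat the numbers with caution:

```python
import os
import tempfile
import timeit

path = os.path.join(tempfile.mkdtemp(), "out.txt")


def reset():
    """Delete the file so the next timed call starts from a missing file."""
    if os.path.exists(path):
        os.remove(path)


def open_w_missing():
    """Open a file that does not exist yet in write mode, then close it."""
    with open(path, "w"):
        pass


# Each of the 100 runs calls reset() once, then times one open
times = timeit.repeat(open_w_missing, setup=reset, repeat=100, number=1)
print(min(times))
```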

At any rate, now that we have our strings, we can begin testing:

>>> import timeit
>>> min(timeit.repeat(setup=setup, stmt=open_r))
462.8889031000001
>>> min(timeit.repeat(setup=setup, stmt=open_w))
201.32850720000033
>>> min(timeit.repeat(setup=setup, stmt=path_r))
576.0263794000002
>>> min(timeit.repeat(setup=setup, stmt=path_w))
460.5153201000003

One thing that’s worth mentioning before we discuss the results is that I had to exclude the system command solution. Whenever it was executed, a command prompt launched on my system. It was so slow that I didn’t bother finishing the test.

With that said, IO is an extremely slow process in general. Even without the fun little window spam, these solutions took forever to test. In fact, I wouldn’t even read too far into these metrics because there’s just too much variability between runs.
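Part of the reason these tests take so long is that timeit.repeat() executes each statement one million times per repetition by default. That means the figures above are seconds per million calls, so a result like 462 works out to roughly 462 microseconds per call. As a sketch, passing a smaller number makes the tests finish quickly, and dividing by it gives a per-call time. Unlike the test strings above, this version closes the file on every call:

```python
import os
import tempfile
import timeit

path = os.path.join(tempfile.mkdtemp(), "out.txt")
open(path, "w").close()  # make sure the file exists


def open_r():
    """Open the existing file for reading and close it again."""
    with open(path, "r"):
        pass


number = 1_000  # calls per repetition instead of the default 1,000,000
best = min(timeit.repeat(open_r, repeat=3, number=number))
per_call = best / number
print(f"{per_call * 1e6:.1f} microseconds per open()")
```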

That said, I’m most interested in the difference between the speed of reading versus writing when using the open() function. It makes me wonder how much more work goes into preparing a file for reading versus writing. However, I didn’t see quite as dramatic of a difference with the pathlib solutions.

If anyone is interested in doing a bit more research, I’d love to know more about the inner workings of these solutions. In general, I’m fairly skeptical of my metrics, but I don’t have a ton of time to play around with these sorts of things.

At any rate, let’s move on to the challenge!

Challenge

Now that we’ve had a chance to look at the performance, we can move on to the challenge. After having a chance to play around with file opening, I figured the sky’s the limit for IO challenges. As a result, I wasn’t really sure where to start.

At first, I thought it might be interesting to try to put together a quine, which is a program that outputs its own source code. Unfortunately, these are usually done through standard output and not to files. In fact, I wasn’t able to find any examples that output to a file, so I decided that wasn’t the way to go.

Instead, I figured we could take this idea of opening files a step further by moving on to file reading. In other words, now that we know how to open a file, what would it take to read the contents of that file? Specifically, I’m interested in writing a program similar to cat for Linux users:

cat example.txt  # Outputs the contents of the file

This program should prompt the user for a file name and output the contents to standard out. In addition, it’s safe to assume the supplied file is text, but you’re welcome to create a more robust program if desired:

>>> Please enter the path to a text file: example.txt
Here are some sample file contents!

Naturally, a solution to this challenge will involve one of the file opening methods discussed in this article. From there, it’s up to you to decide how you want to read and display the file.
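If you’re not sure where to start, here is one possible sketch built on pathlib and the try/except pattern from earlier. It is only one way to approach the challenge, and the function names are arbitrary:

```python
from pathlib import Path


def read_text_file(path):
    """Return the contents of a text file, or None if it can't be found."""
    try:
        with Path(path).open("r") as text_file:
            return text_file.read()
    except FileNotFoundError:
        return None


def main():
    """Prompt for a path and print the file's contents to standard out."""
    path = input("Please enter the path to a text file: ")
    contents = read_text_file(path)
    if contents is None:
        print(f"No such file: {path}")
    else:
        print(contents, end="")


# Call main() to try it interactively
```

Keeping the reading logic separate from the prompt makes it easier to swap in a different opening method or to handle other errors later.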

As always, I’ve come up with a solution already! Check it out:

If you’d like to share your own solution, head on over to Twitter and share your solution using the hashtag #RenegadePython. Alternatively, you can share your solution with our GitHub repo, and I’ll tweet it out if you want. I’m excited to see what you come up with!

A Little Recap

At long last, we’re finished! Here are all the solutions in one place:

# "Open" a file with system commands
import os
os.system('type NUL > out.txt')

# Open a file for reading with the open() function
open("out.txt", "r")

# Open a file for reading with the pathlib module
from pathlib import Path
Path("out.txt").open("r")

If you liked this article, and you want to show your support, head on over to my list of ways you can help grow the site. Over there, you’ll find links to my YouTube channel, Patreon, and newsletter.

Otherwise, thanks for sticking around! I hope to see you back here soon.

Jeremy Grifski

Jeremy grew up in a small town where he enjoyed playing soccer and video games, practicing taekwondo, and trading Pokémon cards. Once out of the nest, he pursued a Bachelors in Computer Engineering with a minor in Game Design. After college, he spent about two years writing software for a major engineering company. Today, he pursues a PhD in Engineering Education in order to ultimately land a teaching gig. In his spare time, Jeremy enjoys spending time with his wife, playing Overwatch and Phantasy Star Online 2, practicing trombone, watching Penguins hockey, and traveling the world.
