As I further my quest in studying for the qualifying exam, I figured I’d step away from algorithms a bit and focus on something I’m quite a bit more comfortable with: process synchronization. In particular, I want to talk about the three main forms of process synchronization: locks, semaphores, and monitors.
Table of Contents
Process Synchronization Overview
At a high level, it’s important to understand why we need process synchronization. Or rather, what’s wrong with having asynchronous processes?
In general, there’s nothing wrong with asynchronous processing. In fact, in many cases, it’s ideal. After all, you and I are currently performing different tasks simultaneously, and it’s working out fine.
However, in computer systems, asynchronous processes sometimes need access to the same information. If that information is read-only, there aren’t really any issues. That said, if multiple processes are able to edit that information, there can be issues of data consistency.
A common example would be to have two processes that want to manipulate the same variable in shared memory. For instance, let’s say both processes are executing the same program where x is a shared variable:
y = x # Read x = y + 1 # Write
What is the final value of x if x starts at 5? Well, if the processes happen one after another as we’d hope, then x stores 7. However, x could also be 6. After all, what happens if both programs read x at the same time? Let’s take a look:
Process A: y = x # x = 5 Process B: y = x # x = 5 Process A: x = y + 1 # x = 6 Process B: x = y + 1 # x = 6
Obviously, we don’t want this sort of ambiguity, so we introduce something called process synchronization. Throughout the rest of this article, we’ll discuss a few mechanism for dealing with this problem. Naturally, much of the examples are borrowed from The Ohio State University’s CSE 6431 lecture notes:
- OSU CSE 6431 Process Synchronization Part 1 (Busy Waiting)
- OSU CSE 6431 Process Synchronization Part 2 (Semaphores)
- OSU CSE 6431 Process Synchronization Part 3 (Semaphores)
- OSU CSE 6431 Process Synchronization Part 4 (Monitors)
That said, the analysis is strictly my own.
The Critical Section Problem
The problem outlined in the overview is known as the critical section problem. In particular, the critical section is any section of code which accesses a shared variable. In the example above, x was a shared variable, and both the processes were trying to read it then write to it.
As demonstrated in that example, we can run into a race condition where a process beats another process to a shared variable before the slower process has a chance to finish its action. Of course, race conditions are undesirable because they make a program nondeterministic.
To deal with the critical section problem, we usually resort to some form of atomicity and mutual exclusion. In other words, we want to make sure that any accesses performed on a shared variable are done so in a safe way. Using the previous example, we want process A to complete its critical section before process B enters its critical section (i.e. no interleaving commands).
To solve the critical section problem, we can use a handful of mechanisms which will be described in the remainder of this article.
Process Synchronization Mechanisms
Up to this point, we’ve talked about why process synchronization is important. Now, we’re going to discuss how it’s accomplished. In the following subsections, we’ll lay out three of the common process synchronization mechanisms.
If we wanted to remove the race condition from our original example, how would we do it? Perhaps we could introduce some sort of loop that waits on yet another shared variable called a lock:
while lock == 1: pass lock = 1 y = x x = y + 1 lock = 0
Here, we’re trying to add a layer of protection between the processes. If process A grabs the lock first, process B will be stuck waiting until process A sets lock back to zero.
That said, isn’t there still a race condition? Absolutely! Let’s say both processes manage to grab the lock at the same time. Then, they both would execute the critical section. So, what do we do?
As it turns out, just about every processor today has some form of locking mechanism which we can use in this situation. In particular, that mechanism is called test-and-set, and we can use it to rewrite our code above:
while test-and-set(lock) == 1: pass y = x x = y + 1 lock = 0
While this might not look much different, we’re actually guaranteed proper process synchronization because test-and-set is an atomic operation. In other words, there is no race condition—only one process can ever acquire the lock.
While these sort of busy locks are simple, they are wasteful. In particular, busy waiting with loops can waste a lot CPU cycles. Fortunately, there are other methods of process synchronization.
Instead of using locks, we could use semaphores which are integers with a little extra functionality. In particular, they have two atomic operations:
V(S) where S is the semaphore.
On one hand,
P(S) decrements S as long as S is greater than zero. In addition, if
P(S) decrements S, then the process that called
P(S) continues executing. Otherwise, the process is blocked and placed in a queue to wait.
On the other hand,
V(S) checks if there are any processes waiting on the queue. If there are,
V(S) unblocks one of those processes. Otherwise,
V(S) increments S.
As we can probably imagine, we can use a semaphore to then protect a critical section by wrapping it in these two function calls:
P(mutex) y = x x = y + 1 V(mutex)
Here, we are assuming that our mutex (mutual exclusion object) starts with a value of one. Whichever process acquires the mutex first is free to execute its critical section. Meanwhile, the other process is blocked until the first process exits its critical section.
Incidentally, we could have any arbitrary number of processes running concurrently and this solution would protect our critical section. That said, semaphores are tricky to use and difficult to test.
Like semaphores, monitors also provide mutual exclusion with an added layer of abstraction. In particular, monitors consist of shared data and a set of procedures. In order to interact with the shared data, the process must use the defined procedures which are protected through mutual exclusion. In other words, only one process can interact with the monitor at a time.
Also like semaphores, monitors introduce condition variables which support two operations:
signal. In addition, each condition variable has its own queue which can be used to check if any processes are waiting.
wait causes the current process to stop processing. Once stopped, the process is added to that condition variable’s queue. Meanwhile,
signal tells the next process in that condition variable’s queue to begin executing.
In practice, we’d need to define some condition variable as well as a set of procedures. However, for our simple case, we don’t even need a condition variable. After all, monitor procedures guarantee mutual exclusion:
procedure increment(x): y = x x = y + 1
In a future article, we’ll look at a few problems like readers-writers and producers-consumers where the condition variables are necessary for tracking available resources.
Want to Learn More?
At this point, I’m two topics into studying for the qualifying exam. If you felt like anything was unclear, that’s probably because I don’t understand it well enough for this exam, so feel free to share your feedback.
Otherwise, check out the following related books:
- The Elements of Computing Systems: Building a Modern Computer from First Principles by Nisan and Schocken
- The Design of the UNIX Operating System by Bach
As always, I don’t necessarily endorse the products listed above. At the very least, I try to pick highly rated products that are relevant to the article. If you have any recommendations for good products, let me know in the comments.
If books aren’t your thing, why not continue browsing the site a bit? Below are a few articles that I think you might enjoy:
- Be Careful When Copying Mutable Data Types
- The Difference Between Statements and Expressions
- Reflecting on My First Semester of Teaching
At any rate, thanks for taking some time out of your day to support the site!
Recently, I was giving a lecture about Java's "common" methods (i.e., all of the methods of Object), and I had epiphany about how Java only has toString() while Python has str() and repr(). So, it...
Magic numbers are numerical constants that have no clear meaning in the code and therefore make code harder to read. Anything that makes code harder to read is something we can use to obfuscate our...