3.8. Semaphores¶

Semaphores are different from the other forms of IPC discussed in this chapter, because they do not allow for the general exchange of data. That is, you cannot use a semaphore to transmit application-specific information from one process to another. Rather, semaphores are used to synchronize access to shared resources; that is, semaphores control the timing of shared accesses that may conflict.

As a preliminary example, consider a multi-process database application that handles banking information. Assume that one process is accessing the file for your account to record a deposit that you’ve made. While this is happening, another process also opens your account file to record a purchase. You want to make sure that the transaction for the deposit is completely recorded before money is removed from your account. That is, you want the application to control the timing of these events.

Synchronization is a complex topic that will be covered in its own chapter. In this section, we will start with the simplest case of synchronization with semaphores: signaling to another process that an action has been performed. This concept is very similar to the signals used to control the life cycle of a process (e.g., SIGKILL). The form of signaling that we are now discussing is more generic and allows applications to define their own custom events. That is, semaphores let processes inform other processes that something has happened, and that something is a custom event that only matters to that application. This type of signaling can be considered to follow the message passing IPC model, as each signal sent requires a system call.

Fundamentally, a semaphore is a non-negative integer [1] with two operations: incrementing or decrementing the value by 1. For a variety of reasons, these two operations have many different names that can be used interchangeably. Incrementing a semaphore is also called upping, signaling, posting, and V (from the Dutch word verhogen, “to raise”). Decrementing a semaphore is also called downing, waiting, or P (from the Dutch proberen, “to test”). [2]

The key behavior is that any process that attempting to decrement a semaphore whose current value is 0 will become blocked until another process increments the value. If the current value is greater than 0 when a process decrements the value, the process can continue processing without waiting. When using semaphores, a simple heuristic to consider for the initial value is to ask how many processes should be given immediate access. Specifically, the initial value determines how many decrements can occur (without any corresponding increments) before processes start to get blocked. If a semaphore is initialized to 1, a single process can decrement the semaphore without waiting; a second process would be blocked. Initializing the semaphore to 10 would allow 10 processes through. This topic will be discussed in more detail in Chapters 8 and 9.

To illustrate the basic signaling pattern described above, consider two processes (call them A and B) that have access to a shared memory region. Assume that A is writing data to the region and B needs to know when the writing is completed. (For simplicity, this scenario is a one-time data exchange. Once A is finished writing, it will not write again.) To make the signaling happen, A and B share access to a semaphore initialized to 0. A will increment the semaphore after it has completed writing the data and B will decrement the semaphore before it attempts to read. Observe the timings that are possible with this sequence of events:

In one ordering, B attempts to decrement the semaphore before A has completed writing. Since the semaphore’s value is 0, B becomes blocked and must wait until A finishes. A eventually completes the write and increments the semaphore, unblocking B.

In a different ordering, A finishes writing and increments the semaphore before B tries to read. When B later goes to read the data, the semaphore’s value will be 1 (because A already incremented the value) and it can continue without blocking.

In either order, the semaphore guarantees that B cannot read any data until A has completed writing to the shared memory.

Bug Warning

In the scenario above, it is important to initialize the semaphore to 0 to achieve the proper timing. A common mistake here would be to initialize the semaphore to 1, thinking that we only want one of the two processes to access the shared memory. This practice is known as mutual exclusion, and will be discussed as an important synchronization pattern. The problem in this scenario is that mutual exclusion does not ensure the proper timing. Specifically, if the semaphore is initialized to 1 and B decrements the semaphore first, A will be blocked. Consequently, A cannot even begin writing to the shared memory and both processes will end up waiting indefinitely.

3.8.1. POSIX vs. System V Semaphores¶

Semaphores, like message queues and shared memory, have both a POSIX and System V interface specification. Key differences between the two versions of semaphores include the following:

POSIX semaphores are created and initialized in a single operation, one at a time. System V semaphores are created as a set, and each semaphore can be initialized independently later.

POSIX semaphores can only be incremented or decremented by 1. System V semaphore operations allow processes to specify any unsigned integer that will be added to the semaphore’s value immediately.

POSIX semaphores provide non-blocking operations that try to decrement the value. If the value is currently 0 (which would block the process), the attempt will fail and the process can react in other ways. System V semaphores do not provide this feature.

System V semaphores are identified by key_t values. POSIX semaphores can be identified by valid POSIX names (begin with a “/”) or they can be unnamed.

3.8.2. POSIX Named Semaphores¶

POSIX defines two types of semaphores: named and unnamed. Named semaphores are created using the standard POSIX arguments of name, oflag, and mode, along with an initial unsigned integer value. The mode and value parameters must both be included when creating a new semaphore, and both must be excluded when connecting to an existing semaphore. The sem_wait() and sem_post() functions decrement (wait) or increment (post) the semaphore’s value. If the value is currently 0, sem_wait() will block the current process until the value is changed by another process calling sem_post(). Note that sem_post() will only unblock one process at a time; if five processes are waiting on the semaphore, then five calls to sem_post() must be made to unblock them all.

C library functions – <semaphore.h>

sem_t *sem_open (const char *name, int oflag, ... /* mode_t mode, unsigned int value */ );: Returns (and optionally creates) a named semaphore.
int sem_wait (sem_t *sem);: Decrement the semaphore’s value; block if the value is currently 0.
int sem_post (sem_t *sem);: Increment the semaphore’s value; resume another process if the value is 0.
int sem_close (sem_t *sem);: Close a semaphore.
int sem_unlink (const char *name);: Delete a named semaphore.

Code Listing 3.11 illustrates how semaphores can be used to control the timing of two processes. In this example, the semaphore is used to guarantee the messages are printed in the correct order (“first” then “second” then “third”). Specifically, the semaphore blocks the child process until the parent prints “first” and ups the semaphore; then the parent calls wait(), which forces it to wait until the child exits. It is important to emphasize that named semaphores can be used by unrelated processes; they do not have to be parent and child as used in this example.

/* Code Listing 3.11:
   Creating and using a POSIX semaphore to control the timing of parent/child execution
 */

/* Create and open the semaphore */
sem_t *sem = sem_open ("/OpenCSF_Sema", O_CREAT | O_EXCL, S_IRUSR | S_IWUSR, 0);
assert (sem != SEM_FAILED);

/* Fork to create the child process */
pid_t child_pid = fork();
assert (child_pid != -1);

/* Note the child inherits a copy of the semaphore connection */

/* Child process: wait for semaphore, print "second", then exit */
if (child_pid == 0)
  {
    sem_wait (sem);
    printf ("second\n");
    sem_close (sem);
    return 0;
  }

/* Parent prints then posts to the semaphore and waits on child */
printf ("first\n");
sem_post (sem);
wait (NULL);

/* Now the child has printed and exited */
printf ("third\n");
sem_close (sem);
sem_unlink ("/OpenCSF_Sema");

In many circumstances, a process may wish to check on the status of a semaphore without getting blocked indefinitely. For instance, assume one process is using a semaphore to signal that it has completed some important task; another process may want to check on whether the task has been completed, but still continue processing even if the event has not happened yet. Downing the semaphore with sem_wait() would not work, because it would block the process until the event occurs. Instead, POSIX provides two alternative functions for waiting on a semaphore. With sem_trywait(), a process can try to down the semaphore; however, if the semaphore’s current value is 0, then sem_trywait() returns an error instead of blocking. Alternatively, sem_timedwait() allows the process to specify a maximum amount of time to block; if no post occurs before the timeout arrives, then the process is unblocked and an error is returned.

C library functions – <semaphore.h>

int sem_trywait (sem_t *sem);: Try to decrement the semaphore; return an error if doing so would block
int sem_timedwait (sem_t *sem, const struct timespec *abs_timeout);: Decrement the semaphore, but place a time limit on the blocking

3.8.3. POSIX Unnamed Semaphores¶

POSIX unnamed semaphores provide a lightweight approach for creating and using semaphores. Specifically, where sem_open() returns a pointer to a newly allocated semaphore, sem_init() takes a reference to a semaphore variable (declared as a sem_t) that has already been allocated within the current process’s memory space and sets the semaphore to the value passed. Once the semaphore is initialized, the other functions specified for named semaphores (e.g., sem_post() and sem_wait()) can be used.

C library functions – <semaphore.h>

int sem_init (sem_t *sem, int pshared, unsigned int value);: Create and initialize a POSIX unnamed semaphore.
int sem_destroy (sem_t *sem);: Delete a POSIX unnamed semaphore.

POSIX unnamed semaphores have one very important difference from named semaphores: unnamed semaphores must exist in shared memory. That is, the preceding code example would not work with unnamed semaphores, because the parent and child processes have distinct memory spaces. As such, the sem_wait() and sem_post() calls in that code are meaningless, because they are operating on two different semaphores. However, if a shared memory space were set up with shmat(), the semaphore could be stored within that region and the code would work.

Bug Warning

macOS provides POSIX named semaphores, but it does not support unnamed semaphores. macOS programs written with unnamed semaphores will (unfortunately) compile, but the program will not behave correctly. The sem_wait() function will return an error and will not block the process. As noted previously, applications targeting macOS should generally use the System V IPC interface rather than POSIX.

The shared memory requirement for POSIX unnamed semaphores limits their usefulness in terms of IPC. They are much more commonly used with multithreaded programs, since multiple threads in the same process automatically share the same memory space. Using unnamed semaphores in that case reduces the kernel system overhead, as the semaphore is stored within the process itself.

One scenario where POSIX unnamed semaphores are very helpful for IPC arises when creating a complex shared data structure. For instance, consider a binary search tree consisting of millions of nodes that is stored in a shared memory region. If the application needs to associate a distinct semaphore with each node, assigning and keeping track of millions of names would be a horrifying burden. Instead, each node’s struct declaration could include a sem_t field. This approach would simplify the management of the data structure, allowing each semaphore to be created or destroyed automatically with the tree node.

[1]

This section is describing the typical implementation of semaphores in most modern OS. The original definition of semaphores allowed the values to be negative, too. In the original version, decrementing the semaphore would first subtract one from the current value, then check if the result was negative; if so, the process would block. In modern implementations, the value is checked first; if the current value is zero, the process blocks and the subtraction does not occur until later. The difference in behavior has very subtle implications that go beyond the scope of this book. For our purposes, either implementation will work sufficiently.

[2] P() and V() were the original names for the semaphore operations. Semaphores were proposed by the Dutch computer scientist, Edsger Dijkstra. These names are never used in modern systems, but they are included here for historical reference. Our general preference is to use decrement/increment, given that this pair clearly indicate how the value changes. We also use wait/post, however, in relation to POSIX semaphores to match the functions (sem_wait() and sem_post()) as named in the interface.