3.3. Pipes and FIFOs

Pipes allow processes to communicate using a unidirectional byte stream, with two ends designated by distinct file descriptors. A common visual analogy for a pipe is a real-world water pipe; water that is poured into one end of the pipe comes out the other end. In the case of IPC, the “water” is the sequence of bytes being sent between the processes; the bytes are written into one end of the pipe and read from the other end. Pipes have several important characteristics that shape their use:

Pipes are unidirectional; one end must be designated as the reading end and the other as the writing end. Note that there is no restriction that different processes must read from and write to the pipe; rather, if one process writes to a pipe then immediately reads from it, the process will receive its own message. If two processes need to exchange messages back and forth, they should use two pipes.

Pipes are order preserving; all data read from the receiving end of the pipe will match the order in which it was written into the pipe. There is no way to designate some data as higher priority to ensure it is read first.

Pipes have a limited capacity and they use blocking I/O; if a pipe is full, any additional writes to the pipe will block the process until some of the data has been read. As such, there is no concern that the messages will be dropped, but there may be performance delays, as the writing process has no control over when the bytes will be removed from the pipe.

Pipes send data as unstructured byte streams. There are no pre-defined characteristics to the data exchanged, such as a predictable message length. The processes using the pipe must agree on a communication protocol and handle errors appropriately (such as if one of the processes terminates the communication early).

Messages that are no larger than the size specified by PIPE_BUF are guaranteed to be written atomically. As such, if two processes write such messages to a pipe at the same time, the bytes of each message stay contiguous and the two messages are not interleaved. Larger writes carry no such guarantee.

3.3.1. Basic Pipes

The simplest form of communication with pipes is to provide parent-child communication using the pipe() library function. This function takes an int array of length 2. (Recall that arrays are always passed by reference.) Assuming the kernel is able to create the pipe, the array will contain the file descriptors for the two ends of the pipe. If the pipe creation fails, the function returns -1.

C library functions – <unistd.h>

int pipe (int pipefd[2]);: Opens a pipe and returns the file descriptors in the array.

Once the pipe is opened, we can use the standard read() and write() functions with the file descriptors. Code Listing 3.1 shows the standard convention of using the array index 1 for writing and 0 for reading. This practice aligns with the use of file descriptor 1 for stdout and 0 for stdin.

/* Code Listing 3.1:
   Sending a simple message through a pipe to a child process
 */

int pipefd[2];
char buffer[10];
/* Clear the buffer */
memset (buffer, 0, sizeof (buffer));

/* Open the pipe */
if (pipe (pipefd) < 0)
  {
    printf ("ERROR: Failed to open pipe\n");
    exit (1);
  }

/* Create a child process */
pid_t child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* Child closes write-end, then reads from the pipe */
    close (pipefd[1]);
    ssize_t bytes_read = read (pipefd[0], buffer, 10);
    if (bytes_read <= 0)
      exit (0);

    printf ("Child received: '%s'\n", buffer);
    exit (0);
  }

/* Parent closes the unused reading end */
close (pipefd[0]);

/* Parent sends 'hello' and waits */
strncpy (buffer, "hello", sizeof (buffer));
printf ("Parent is sending '%s'\n", buffer);
write (pipefd[1], buffer, sizeof (buffer));
wait (NULL);
printf ("Child should have printed the message\n");

This code illustrates several conventions for using pipes. First, for the parent and child to communicate, pipe() must be called before fork(). This ordering is required to give both processes access to the pipe. Second, it is customary for each process to close one end of the pipe immediately after the fork(). This practice helps to align the unidirectional nature of the pipe with its intended use. That is, the convention is that a pipe is reserved for sending data from one process to another; if the second process wants to respond, it should use a different pipe. Third, the file descriptor at index 1 is used for writing, while index 0 is for reading.

Bug Warning

It is very important to follow the convention of closing the unused end of the pipe immediately after the fork(). Failure to do so can cause programs to freeze unexpectedly. Consider the following example:

int pipefd[2];
pipe (pipefd);

if (fork () == 0)
  exit (0); /* child exits without writing */

char buffer[10];
read (pipefd[0], buffer, sizeof (buffer));

In the final line of this example, the parent process tries to read from the pipe. A read() on an empty pipe returns 0 (indicating end-of-file) only once every file descriptor for the write end has been closed; otherwise, it blocks until data arrives. The child's copy of the write end is closed when the child exits, but the parent never closes its own copy, pipefd[1]. Since the write end is still open and no process will ever write to it, the parent will block indefinitely. This problem can be difficult to diagnose, such as when the child process is redirecting the standard output of an external program using dup2() (which we illustrate below) and the external program produces no output. To avoid this frustration, always close the unused end of the pipe.

Figure 3.3.3: Pipes are unidirectional and should not be used to respond

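Figure 3.3.3 illustrates a key feature and common bug with pipes. A process that reads from pipefd[0] should not use the same pipe to send a reply. If the process has closed pipefd[1], as the convention above requires, a later write() to pipefd[1] fails with -1 and errno set to EBADF. If the process left pipefd[1] open, the write() succeeds, but the reply enters the same pipe that carries the incoming messages, so the process may read back its own data instead of delivering it. To allow for bidirectional communication, use two pipes as in Code Listing 3.2. After performing the fork(), either pipe can be designated to be used for parent-to-child or child-to-parent messages.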

/* Code Listing 3.2:
   Using two pipes for bidirectional communication between parent and child
 */

int p2cfd[2]; /* parent-to-child */
int c2pfd[2]; /* child-to-parent */
char buffer[10];
ssize_t bytes_read = 0;

/* Clear the buffer and open the pipe */
memset (buffer, 0, sizeof (buffer));
if ((pipe (p2cfd) < 0) || (pipe (c2pfd) < 0))
  {
    printf ("ERROR: Failed to open pipe\n");
    exit (1);
  }

/* Create a child process */
pid_t child_pid = fork ();
assert (child_pid >= 0);

if (child_pid == 0)
  {
    /* Child closes write end of p2c, read of c2p */
    close (p2cfd[1]);
    close (c2pfd[0]);
    bytes_read = read (p2cfd[0], buffer, 10);
    if (bytes_read <= 0)
      exit (0);
    printf ("Child received: '%s'\n", buffer);

    /* Child sends response of "goodbye" */
    strncpy (buffer, "goodbye", sizeof (buffer));
    write (c2pfd[1], buffer, 10);
    exit (0);
  }

/* Parent closes read end of p2c, write of c2p */
close (p2cfd[0]);
close (c2pfd[1]);

/* Parent sends 'hello' and waits */
strncpy (buffer, "hello", sizeof (buffer));
printf ("Parent is sending '%s'\n", buffer);
write (p2cfd[1], buffer, sizeof (buffer));

/* Parent reads response back from child */
bytes_read = read (c2pfd[0], buffer, 10);
if (bytes_read <= 0)
  exit (1); /* should receive response */
printf ("Parent received: '%s'\n", buffer);

Figure 3.3.4 illustrates the circular structure of the two pipes used in Code Listing 3.2. The parent uses p2cfd to send data to the child. Responses from the child use the other pipe, identified by c2pfd. In both cases, the calls to write() use index 1 and the read() calls use index 0.

Figure 3.3.4: Structure of the two pipes used in Code Listing 3.2

3.3.2. Pipes and Shell Commands

One of the most common uses of pipes is to link together multiple commands on the command line. For instance, consider the following command line:

$ ls -l | sort -n -k 5 | tail -n 1 | awk '{print $NF}'

This command line creates four processes that are linked together. First, the ls command prints out the list of files along with their details. This list is sent as input to sort, which sorts numerically based on the 5th field (the file size). The tail process then grabs the last line, which is the line for the largest file. Finally, awk will print the last field of that line, which is the file name of whatever file is the largest.

Figure 3.3.5: Chained logical structure of a sequence of bash commands connected with pipes

Figure 3.3.5 illustrates the chained structure of these four processes. These four processes are created by bash using both fork() and exec(). To connect each process with the one after it, bash creates a pipe for each link before calling fork(), so that the child processes inherit the pipe's file descriptors. (This is the reason the vertical bar (|) is referred to as a pipe.) However, there is an additional step: bash needs to link the ends of each pipe to the standard output of one process and the standard input of the next. The dup2() function accomplishes this task.

C library functions – <unistd.h>

int dup2 (int oldfd, int newfd);: Makes newfd refer to the same open file as oldfd, closing newfd first if it was open

Bug Warning

The arguments for dup2() are easily confused because of the names given in the standard C documentation. The newfd is the file descriptor that you want to use after the call to dup2(). For instance, if you want to change your file descriptors so that subsequent calls to printf() write to the pipe instead of the standard output screen, the newfd argument should be STDOUT_FILENO. (The confusion seems to arise because the pipe was created after standard output, so new programmers often think of the pipe file descriptor as “new,” which is incorrect.)

Code Listing 3.3 illustrates the basic functionality of how bash uses dup2() with pipes to link commands together. Specifically, the command line would be ls | sort, so bash needs to create and link two processes. The sort process closes the write end of the pipe and links the read end to become its standard input.

Similarly, the ls process closes its read end of the pipe and links the write end to its standard output. Anything that ls writes to its standard output (using printf()), sort would read from its standard input; both processes are unaware that the pipe exists. In fact, even if this code used exec() to load new program code within these child processes, the processes would continue to use the pipe as stdin and stdout without any change to the program’s source code.

/* Code Listing 3.3:
   Creating a bash-like linkage between two processes
 */

/* Parent is acting like 'bash' interpreting the command line:
     $ ls | sort
   This example assumes the variable declaration and pipe creation
   as shown in Code Listing 3.1. */

/* 'sort' child process */
/* Call fork() outside assert() so it still runs if NDEBUG is defined */
child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* 'sort' closes unused write end of the pipe */
    close (pipefd[1]);
    /* ...and uses the read end as standard input */
    dup2 (pipefd[0], STDIN_FILENO);
    /* Reading from "stdin" now reads from the pipe; leave room for a
       null terminator (a longer message is truncated) */
    ssize_t bytes_read = read (STDIN_FILENO, buffer, sizeof (buffer) - 1);
    if (bytes_read <= 0)
      exit (0);
    buffer[bytes_read] = '\0';
    /* Trim off the trailing newline character */
    char *token = strtok (buffer, "\n");
    printf ("'sort' process received '%s'\n", token);
    exit (0);
  }

/* 'ls' child process */
child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* 'ls' closes the read end of the pipe */
    close (pipefd[0]);
    /* ...and uses the write end as standard output */
    dup2 (pipefd[1], STDOUT_FILENO);

    /* printf() now writes to the pipe instead of the screen */
    printf ("list of files\n");
    exit (0);
  }
/* 'bash' parent closes both ends of the pipe within itself */
close (pipefd[0]);
close (pipefd[1]);
/* Wait for both children to finish */
wait (NULL);
wait (NULL);

This example also illustrates one subtle aspect of the closing of pipes. When a process closes one end of the pipe, it is only closing its access to that end of the pipe. Other processes may still use that end of the pipe. For instance, in the previous example, observe that the parent process closes both pipefd[0] and pipefd[1]. This does not affect the pipe itself, and it does not prevent the two child processes from communicating via the pipe. All these two lines of code do is close the parent’s (bash’s) access to the pipe, preventing it from reading from or writing to the pipe. (Closing the parent’s copy of the write end also matters to the reader: a process reading from the pipe sees end-of-file only after every copy of the write end has been closed.) The pipe will remain in existence until all processes with access either close both ends of the pipe or exit (which closes all open file descriptors).

3.3.3. FIFOs

The pipes described above create a simple mechanism for parent and child processes to communicate, but they cannot be used for unrelated processes. Specifically, notice that the call to pipe() must happen within the same program that later calls fork(). That is, the pipe is first created within a single process and is only shared with processes created as children (or children of children, if the child also calls fork()). This approach will not work for two unrelated processes that need to create an ad hoc communication session. FIFOs (first-in, first-out) are a variation on a pipe that creates a more flexible communication structure.

FIFOs work by attaching a filename to the pipe. For this reason, FIFOs are also called named pipes as opposed to the anonymous pipes discussed previously. FIFOs are created by one process that calls mkfifo(). Once created, any process (with correct access permissions) can access the FIFO by calling open() on the associated filename. Once the processes have opened the file, they can use the standard read() and write() functions to communicate.

C library functions – <sys/stat.h>

int mkfifo (const char *pathname, mode_t mode);: Creates a new file identified by the pathname to use as a FIFO

A common use for FIFOs is to create client/server applications on the same machine. For example, consider an anti-virus server that runs in the background, scanning for corrupted files. When the system administrator wants to get a report on potentially bad files, they run a client application that uses a FIFO to initiate contact with the server. The server and the client application are distinct processes that are running separate programs. That is, neither process was created by the other calling fork(). As such, an anonymous pipe created with pipe() would not work. Instead, both processes use the name attached to the FIFO to set up the communication.

As a simple scenario, consider a server that prints hello whenever a client writes a non-zero value to the FIFO and shuts down when the client writes a zero. Code Listing 3.4 shows the structure of the server. The server starts by creating the FIFO with read and write permissions for the current user. Then, the server opens the FIFO in read-only mode and enters the listening loop. Once the server reads a value of 0 from the FIFO, it exits the loop, then closes and deletes the FIFO.

/* Code Listing 3.4:
   The basic structure of a server process using a FIFO
 */

/* Create the FIFO or die trying */
const char *FIFO = "/tmp/MY_FIFO";
/* Call mkfifo() outside assert() so it still runs if NDEBUG is defined */
int created = mkfifo (FIFO, S_IRUSR | S_IWUSR);
assert (created == 0);

/* Try to open the FIFO. Delete FIFO if open() fails */
int fifo = open (FIFO, O_RDONLY);
if (fifo == -1)
  {
    fprintf (stderr, "Failed to open FIFO\n");
    unlink (FIFO);
    return 1;
  }

/* Main server loop */
while (1)
  {
    int req = 0;
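    /* Skip short or failed reads. Caution: read() returns 0 once every
       writer has closed the FIFO, so this loop spins if a client
       exits without sending a 0. */
    if (read (fifo, &req, sizeof (int)) != sizeof (int))
      continue;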

    /* If we read a 0, quit; otherwise print hello */
    if (req == 0)
      break;
    printf ("hello\n");
  }

/* Read a 0 from the FIFO, so close and delete the FIFO */
close (fifo);
printf ("Deleting FIFO\n");
unlink (FIFO); 

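Code Listing 3.5 shows a sample client. This client opens the FIFO, then writes a sequence of integers (5 down to 0) into the FIFO. Note that anything the client writes after the 0 would never be processed, as the server closes and deletes the FIFO at that point; once the FIFO has no reader, a further write() fails with the error EPIPE and the client receives the SIGPIPE signal.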

/* Code Listing 3.5:
   A client process that sends six messages to the server in Code Listing 3.4
 */

const char *FIFO = "/tmp/MY_FIFO";

/* Use the file name to open the FIFO for writing */
int fifo = open (FIFO, O_WRONLY);
assert (fifo != -1);

/* Write to the FIFO 6 times, sending one int each time */
for (int index = 5; index >= 0; index--)
  {
    /* Write 5, 4, 3, 2, 1, 0 into the FIFO */
    int msg = index;
    write (fifo, &msg, sizeof (int));

    /* Add a slight delay each time */
    sleep (1);
  }

/* Close the FIFO */
close (fifo);

Although FIFOs use standard file I/O functions (e.g., open(), read(), and write()), it is important to note that they are not regular files. Specifically, once data has been read from a FIFO, the data is discarded and cannot be read again (just like an anonymous pipe). In contrast, with a regular file, multiple processes can read the same data from the same file. That is, regular files store data persistently, but FIFOs do not. Consequently, FIFOs cannot be used for broadcasting a single message to multiple recipients; each byte written is delivered to only one reader. Similarly, FIFOs (like pipes) are not suitable for bidirectional communication; if a process writes into the FIFO then immediately tries to read a response, it may read its own message!

Also similar to anonymous pipes, FIFOs use blocking I/O by default. In addition, a call to open() on a FIFO blocks until the other end has also been opened: a process that opens the FIFO for writing waits until some process opens it for reading, and vice versa. As such, there is no concern about a process writing into the FIFO too soon. This behavior may be problematic if the writing process needs to perform some other task. To resolve this problem, pass the O_NONBLOCK option during the call to open() to make the FIFO access non-blocking. Note that a non-blocking open() for writing fails with errno set to ENXIO if no process has the FIFO open for reading, so the writer must be prepared to try again later.