Pipes allow processes to communicate using a unidirectional byte stream, with two ends designated by distinct file descriptors. A common visual analogy for a pipe is a real-world water pipe; water that is poured into one end of the pipe comes out the other end. In the case of IPC, the “water” is the sequence of bytes being sent between the processes; the bytes are written into one end of the pipe and read from the other end. Pipes have several important characteristics that shape their use:
- Pipes are unidirectional; one end must be designated as the reading end and the other as the writing end. Note that there is no restriction that different processes must read from and write to the pipe; rather, if one process writes to a pipe then immediately reads from it, the process will receive its own message. If two processes need to exchange messages back and forth, they should use two pipes.
- Pipes are order preserving; all data read from the receiving end of the pipe will match the order in which it was written into the pipe. There is no way to designate some data as higher priority to ensure it is read first.
- Pipes have a limited capacity and they use blocking I/O; if a pipe is full, any additional writes to the pipe will block the process until some of the data has been read. As such, there is no concern that the messages will be dropped, but there may be performance delays, as the writing process has no control over when the bytes will be removed from the pipe.
- Pipes send data as unstructured byte streams. There are no pre-defined characteristics to the data exchanged, such as a predictable message length. The processes using the pipe must agree on a communication protocol and handle errors appropriately (such as if one of the processes terminates the communication early).
- Writes of at most PIPE_BUF bytes are guaranteed to be atomic. As such, if two processes write such messages to a pipe at the same time, both messages will be written intact and their bytes will not be interleaved with each other.
The simplest form of communication with pipes is to provide parent-child communication using the pipe() library function. This function takes an int array of length 2. (Recall that an array argument is passed as a pointer to its first element, so the function can fill in the caller's array.) Assuming the kernel is able to create the pipe, the array will contain the file descriptors for the two ends of the pipe. If the pipe creation fails, the function returns -1.
C library functions – <unistd.h>
int pipe (int pipefd[2]);
Once the pipe is opened, we can use the standard read() and write() functions with the file descriptors. Code Listing 3.1 shows the standard convention of using the array index 1 for writing and 0 for reading. This practice aligns with the use of file descriptor 1 for stdout and 0 for stdin.
/* Code Listing 3.1:
   Sending a simple message through a pipe to a child process
 */

int pipefd[2];
char buffer[10];

/* Clear the buffer */
memset (buffer, 0, sizeof (buffer));

/* Open the pipe */
if (pipe (pipefd) < 0)
  {
    printf ("ERROR: Failed to open pipe\n");
    exit (1);
  }

/* Create a child process */
pid_t child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* Child closes write-end, then reads from the pipe */
    close (pipefd[1]);
    ssize_t bytes_read = read (pipefd[0], buffer, 10);
    if (bytes_read <= 0)
      exit (0);

    printf ("Child received: '%s'\n", buffer);
    exit (0);
  }

/* Parent closes the unused reading end */
close (pipefd[0]);

/* Parent sends 'hello' and waits */
strncpy (buffer, "hello", sizeof (buffer));
printf ("Parent is sending '%s'\n", buffer);
write (pipefd[1], buffer, sizeof (buffer));
wait (NULL);
printf ("Child should have printed the message\n");
This code illustrates several conventions for using pipes. First, for the parent and child to communicate, pipe() must be called before fork(). This ordering is required to give both processes access to the pipe. Second, it is customary for each process to close one end of the pipe immediately after the fork(). This practice helps to align the unidirectional nature of the pipe with its intended use. That is, the convention is that a pipe is reserved for sending data from one process to another; if the second process wants to respond, it should use a different pipe. Third, the file descriptor at index 1 is used for writing, while index 0 is for reading.
Bug Warning
It is very important to follow the convention of closing the unused end of the pipe immediately after the fork(). Failure to do so can cause programs to freeze unexpectedly. Consider the following example:

int pipefd[2];
pipe (pipefd);

if (fork () == 0)
  exit (0); /* child exits without writing */

char buffer[10];
read (pipefd[0], buffer, sizeof (buffer));

On the final line, the parent process will try to read from the pipe. Instead of returning immediately, read() blocks until data arrives or until an EOF (end-of-file) condition is reported, which happens only once every file descriptor for the write end has been closed. The child exits without writing anything, which closes the child's copy of the write end, but the parent never closed its own copy of pipefd[1]. Since the pipe still has a potential writer (the parent itself), the parent will block indefinitely. This problem can be difficult to diagnose, such as when the child process is redirecting the standard output of an external program using dup2() (which we illustrate below) and the external program produces no output. To avoid this frustration, always close the unused end of the pipe.
Figure 3.3.3 illustrates a key feature and common bug with pipes. A pipe carries data in one direction only, so a process that has taken the reading role (closing pipefd[1] and calling read() on pipefd[0]) cannot turn around and write() to pipefd[1]; the call fails, and if the return value is not checked, the failure goes unnoticed and no data is sent. To allow for bidirectional communication, use two pipes as in Code Listing 3.2. After performing the fork(), either pipe can be designated to be used for parent-to-child or child-to-parent messages.
/* Code Listing 3.2:
   Using two pipes for bidirectional communication between parent and child
 */

int p2cfd[2]; /* parent-to-child */
int c2pfd[2]; /* child-to-parent */
char buffer[10];
ssize_t bytes_read = 0;

/* Clear the buffer and open the pipe */
memset (buffer, 0, sizeof (buffer));
if ((pipe (p2cfd) < 0) || (pipe (c2pfd) < 0))
  {
    printf ("ERROR: Failed to open pipe\n");
    exit (1);
  }

/* Create a child process */
pid_t child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* Child closes write end of p2c, read of c2p */
    close (p2cfd[1]);
    close (c2pfd[0]);

    bytes_read = read (p2cfd[0], buffer, 10);
    if (bytes_read <= 0)
      exit (0);
    printf ("Child received: '%s'\n", buffer);

    /* Child sends response of "goodbye" */
    strncpy (buffer, "goodbye", sizeof (buffer));
    write (c2pfd[1], buffer, 10);
    exit (0);
  }

/* Parent closes read end of p2c, write of c2p */
close (p2cfd[0]);
close (c2pfd[1]);

/* Parent sends 'hello' and waits */
strncpy (buffer, "hello", sizeof (buffer));
printf ("Parent is sending '%s'\n", buffer);
write (p2cfd[1], buffer, sizeof (buffer));

/* Parent reads response back from child */
bytes_read = read (c2pfd[0], buffer, 10);
if (bytes_read <= 0)
  exit (1); /* should receive response */
printf ("Parent received: '%s'\n", buffer);
Figure 3.3.4 illustrates the circular structure of the two pipes used in Code Listing 3.2. The parent uses p2cfd to send data to the child. Responses from the child use the other pipe, identified by c2pfd. In both cases, the calls to write() use index 1 and the read() calls use index 0.
One of the most common uses of pipes is to link together multiple commands on the command line. For instance, consider the following command line:
$ ls -l | sort -n -k 5 | tail -n 1 | awk '{print $NF}'
This command line creates four processes that are linked together. First, the ls command prints out the list of files along with their details. This list is sent as input to sort, which sorts numerically based on the 5th field (the file size). The tail process then grabs the last line, which is the line for the largest file. Finally, awk will print the last field of that line, which is the file name of whatever file is the largest.
Figure 3.3.5 illustrates the chained structure of these four processes. These four processes are created by bash using both fork() and exec(). To link their standard input and output, bash sets up a pipe to connect each process with the one after it. (This is the reason the vertical bar (|) is referred to as a pipe.) As in the earlier listings, each pipe must be created before the fork() so that the child processes inherit it. However, there is an additional step: bash needs to link the pipe with each process’s standard input and output. The dup2() function accomplishes this task.
C library functions – <unistd.h>
int dup2 (int oldfd, int newfd);
Bug Warning
The arguments for dup2() are easily confused because of the names given in the standard C documentation. The newfd is the file descriptor that you want to use after the call to dup2(); the call makes newfd refer to the same open file as oldfd. For instance, if you want to change your file descriptors so that subsequent calls to printf() write to the pipe instead of the standard output screen, the newfd argument should be STDOUT_FILENO and the oldfd argument should be the write end of the pipe. (The confusion seems to arise because the pipe was created after standard output, so new programmers often think of the pipe file descriptor as “new,” which is incorrect.)
Code Listing 3.3 illustrates the basic functionality of how bash uses dup2() with pipes to link commands together. Specifically, the command line would be ls | sort, so bash needs to create and link two processes. The sort process closes the write end of the pipe and links the read end to become its standard input. Similarly, the ls process closes its read end of the pipe and links the write end to its standard output. Anything that ls writes to its standard output (using printf()), sort would read from its standard input; both processes are unaware that the pipe exists. In fact, even if this code used exec() to load new program code within these child processes, the processes would continue to use the pipe as stdin and stdout without any change to the program’s source code.
/* Code Listing 3.3:
   Creating a bash-like linkage between two processes
 */

/* Parent is acting like 'bash' interpreting the command line:
     $ ls | sort
   This example assumes the variable declaration and pipe creation
   as shown in Code Listing 3.1. */

/* 'sort' child process; fork() is called outside of assert() so
   that it still runs when assertions are disabled with NDEBUG */
child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* 'sort' closes unused write end of the pipe */
    close (pipefd[1]);

    /* ...and uses the read end as standard input */
    dup2 (pipefd[0], STDIN_FILENO);

    /* Reading from "stdin" now reads from the pipe. Leave room for a
       null byte; only the first 9 bytes fit in the 10-byte buffer */
    ssize_t bytes_read = read (STDIN_FILENO, buffer, sizeof (buffer) - 1);
    if (bytes_read <= 0)
      exit (0);
    buffer[bytes_read] = '\0';

    /* Trim off the trailing newline character, if present */
    char *token = strtok (buffer, "\n");
    printf ("'sort' process received '%s'\n", token);
    exit (0);
  }

/* 'ls' child process */
child_pid = fork ();
assert (child_pid >= 0);
if (child_pid == 0)
  {
    /* 'ls' closes the read end of the pipe */
    close (pipefd[0]);

    /* ...and uses the write end as standard output */
    dup2 (pipefd[1], STDOUT_FILENO);

    /* printf() now writes to the pipe instead of the screen */
    printf ("list of files\n");
    exit (0);
  }

/* 'bash' parent closes both ends of the pipe within itself */
close (pipefd[0]);
close (pipefd[1]);

/* Wait for both children to finish */
wait (NULL);
wait (NULL);
This example also illustrates one subtle aspect of the closing of pipes. When a process closes one end of the pipe, it is only closing its access to that end of the pipe. Other processes may still use that end of the pipe. For instance, in the previous example, observe that the parent process closes both pipefd[0] and pipefd[1]. This does not affect the pipe itself, and it does not prevent the two child processes from communicating via the pipe. All these two lines of code do is close the parent’s (bash’s) access to the pipe, preventing it from reading from or writing to the pipe. The pipe will remain in existence until all processes with access either close both ends of the pipe or exit (which closes all open file descriptors).
The pipes described above create a simple mechanism for parent and child processes to communicate, but they cannot be used for unrelated processes. Specifically, notice that the call to pipe() must happen within the same program that later calls fork(). That is, the pipe is first created within a single process and is only shared with processes created as children (or children of children, if the child also calls fork()). This approach will not work for two random processes that need to create an ad hoc communication session. FIFOs (first-in, first-out) are a variation on a pipe that creates a more flexible communication structure.
FIFOs work by attaching a filename to the pipe. For this reason, FIFOs are also called named pipes as opposed to the anonymous pipes discussed previously. FIFOs are created by one process that calls mkfifo(). Once created, any process (with correct access permissions) can access the FIFO by calling open() on the associated filename. Once the processes have opened the file, they can use the standard read() and write() functions to communicate.
C library functions – <sys/stat.h>
int mkfifo (const char *pathname, mode_t mode);
A common use for FIFOs is to create client/server applications on the same machine. For example, consider an anti-virus server that runs in the background, scanning for corrupted files. When the system administrator wants to get a report on potentially bad files, they run a client application that uses a FIFO to initiate contact with the server. Both the server and the client application are distinct processes that are running separate programs. That is, neither process was created by the other calling fork(). As such, an anonymous pipe created with pipe() would not work. Instead, both processes use the name attached to the FIFO to set up the communication.
As a simple scenario, consider a server that prints hello whenever a client writes a non-zero value to the FIFO and shuts down when the client writes a zero. Code Listing 3.4 shows the structure of the server. The server starts by creating the FIFO with read and write permissions for the current user. Then, the server opens the FIFO in read-only mode and enters the listening loop. Once the server reads a value of 0 from the FIFO, it exits the loop, then closes and deletes the FIFO.
/* Code Listing 3.4:
   The basic structure of a server process using a FIFO
 */

/* Create the FIFO or die trying. The call is made outside of assert()
   so that it still runs when assertions are disabled with NDEBUG */
const char *FIFO = "/tmp/MY_FIFO";
if (mkfifo (FIFO, S_IRUSR | S_IWUSR) != 0)
  {
    fprintf (stderr, "Failed to create FIFO\n");
    return 1;
  }

/* Try to open the FIFO. Delete FIFO if open() fails */
int fifo = open (FIFO, O_RDONLY);
if (fifo == -1)
  {
    fprintf (stderr, "Failed to open FIFO\n");
    unlink (FIFO);
    return 1;
  }

/* Main server loop */
while (1)
  {
    /* Skip short or failed reads; note that read() returns 0
       whenever no process has the FIFO open for writing */
    int req = 0;
    if (read (fifo, &req, sizeof (int)) != sizeof (int))
      continue;

    /* If we read a 0, quit; otherwise print hello */
    if (req == 0)
      break;
    printf ("hello\n");
  }

/* Read a 0 from the FIFO, so close and delete the FIFO */
close (fifo);
printf ("Deleting FIFO\n");
unlink (FIFO);
Code Listing 3.5 shows a sample client. This client opens the FIFO, then writes a sequence of integers (5 down to 0) into the FIFO. Note that anything the client wrote after the 0 would never be processed, as the server stops reading and deletes the FIFO at that point.
/* Code Listing 3.5:
   A client process that sends six messages to the server in Code Listing 3.4
 */

const char *FIFO = "/tmp/MY_FIFO";

/* Use the file name to open the FIFO for writing */
int fifo = open (FIFO, O_WRONLY);
assert (fifo != -1);

/* Write to the FIFO 6 times, sending one int each time */
for (int index = 5; index >= 0; index--)
  {
    /* Write 5, 4, 3, 2, 1, 0 into the FIFO */
    int msg = index;
    write (fifo, &msg, sizeof (int));

    /* Add a slight delay each time */
    sleep (1);
  }

/* Close the FIFO */
close (fifo);
Although FIFOs use standard file I/O functions (e.g., open(), read(), and write()), it is important to note that they are not regular files. Specifically, once data has been read from a FIFO, the data is discarded and cannot be read again (just like an anonymous pipe). In contrast, with a regular file, multiple processes can read the same data from the same file. That is, regular files store data persistently, but FIFOs do not. Consequently, FIFOs cannot be used for broadcasting a single message to multiple recipients; only one process can read the data. Similarly, FIFOs (like pipes) are not suitable for bi-directional communication; if a process writes into the FIFO then immediately tries to read a response, it may read its own message!
Like anonymous pipes, FIFOs use blocking I/O by default. In addition, the call to open() on a FIFO blocks until both ends have been opened by at least one process. As such, there is no concern about a process writing into the FIFO too soon; if no process has opened the FIFO for reading, the writing process will block in open() until a reader becomes available. This behavior may be problematic if the writing process needs to perform some other task. To resolve this problem, pass the O_NONBLOCK option during the call to open() to make the FIFO access non-blocking. Note that a non-blocking open() for writing fails (with errno set to ENXIO) if no process has the FIFO open for reading, so the writer must be prepared to retry later.