Semaphores provide a general mechanism that allows one thread to indicate that a particular event has happened. One type of event that arises quite commonly in concurrent applications is the need to make sure all threads reach some common point before any of them continues on. That is, multiple threads may perform some initial computation then pause; at that point some manager thread checks these initial results and allows the threads to continue once their results have been approved.

As an example, consider the case of an autonomous car or flying drone. These types of systems require accurate measurements of real-world phenomena like speed, position, and acceleration. However, these measurements involve floating-point calculations that may encounter rounding errors. To improve the accuracy of the vehicle’s autonomous controls, multiple threads may be simultaneously and independently performing these calculations. The manager control thread will occasionally require these threads to reach a common point in time where each thread reports on its best guess of the measurements. The manager may then correct errors in some threads, then allow them to continue.

Achieving this kind of synchronization with semaphores is complex and error-prone. One approach would be to use two semaphores: the manager thread would repeatedly wait on one of them as the threads were finishing their calculations; the manager would then signal on a different semaphore that each thread was waiting on. However, if all threads agree but one crashes, then the entire system would fail; the manager thread would continue to wait on that last thread to finish (which will never happen).

As an alternative, synchronization barriers
allow programs to specify a *minimum* number of threads that must reach a common
point before any can continue. For instance, if the autonomous vehicle manager
thread receives reports from 7 out of 10 threads about the current position,
then the system may be designed to consider this measurement *good enough* and
let the threads continue rather than requiring the last three threads to
finish.

The base functionality of barriers is specified with thread functions. The
barrier is initialized with `pthread_barrier_init()`

, with the last parameter
indicating the *minimum* number of threads that must reach the barrier before
any can be allowed to continue. Threads use the `pthread_barrier_wait()`

function to indicate that they have reached the common point. If the minimum
number of threads have not yet reached the barrier, then the current thread will
become blocked. Once the minimum number has been reached, all threads waiting at
the barrier become unblocked; any future thread that calls
`pthread_barrier_wait()`

will pass right through the barrier without any
delay. Finally, `pthread_barrier_destroy()`

is used to clean up any resources
associated with the barrier.

C library functions – <pthread.h>

`int pthread_barrier_init (pthread_barrier_t *barrier, const pthread_barrierattr_t *attr, unsigned count);`

- Initialize a synchronization barrier with the specified attributes.
`int pthread_barrier_destroy (pthread_barrier_t *barrier);`

- Delete a synchronization barrier.
`int pthread_barrier_wait (pthread_barrier_t *barrier);`

- Make a thread wait until enough have reached the barrier.

To show how barriers work, consider using two threads to determine which function grows faster: the Fibonnacci sequence or the exponential function. That is, is the 100th Fibonacci number greater than $c^{100}$ for some constant $c$? [1] It can be shown that each Fibonacci number is slightly less than double the previous value, so the answer would be no for any constant integer greater than 1. But what if the constant is 1.6? [2] Then the answer is not obvious, but it is easy enough to calculate.

Code Listing 7.13 shows how barriers can address this problem. The
`fibonacci()`

and `exponential()`

threads take a struct that contains a
shared `pthread_barrier_t`

, a digit that specifies how far into the sequence
and what exponent to calculate, and a base value that specifies the base of the
exponent. That is, if digit is specified as 30 and base is 1.6, `fibonacci()`

will calculate the 30th Fibonacci number, while `exponent()`

will calculate
(approximately) $1.6^{30}$.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 | ```
/* Code Listing 7.13:
Using a barrier to pause multiple threads
*/
/* Use a common struct with the barrier and parameters */
struct args {
pthread_barrier_t *barrier;
unsigned int digit;
double base;
unsigned long fib;
unsigned long exp;
};
/* Iterative version of the Fibonacci calculation */
void *
fibonacci (void *_args)
{
struct args *args = (struct args *) _args;
unsigned long previous = 1;
unsigned long current = 1;
unsigned int i;
for (i = 2; i < args->digit; i++)
{
current += previous;
previous = current - previous;
}
/* current is the i-th Fibonacci number */
args->fib = current;
/* Wait at the barrier and compare results */
pthread_barrier_wait (args->barrier);
if (args->fib > args->exp)
printf ("Fibonacci wins: %lu > %lu\n", args->fib, args->exp);
pthread_exit (NULL);
}
/* Calculate base^digit using repeated multiplication */
void *
exponential (void *_args)
{
struct args *args = (struct args *) _args;
double result = 1;
unsigned int i;
/* WARNING: Don't actually do this in real code that needs
Accurate calculations. Repeated floating point arithmetic
inherently has rounding errors that will compound. */
/* Calculate base^digit; cast to unsigned long for comparison */
for (i = 1; i < args->digit; i++)
result *= args->base;
args->exp = (unsigned long) result;
/* Wait on threads to reach the barrier and check the result */
pthread_barrier_wait (args->barrier);
if (args->exp > args->fib)
printf ("Exponential wins: %lu > %lu\n", args->exp, args->fib);
pthread_exit (NULL);
}
``` |

Bug Warning

As indicated in the code, you should not try to calculate the exponential value as shown in this code. For one thing, this code is (intentionally) inefficient and could be easily optimized. More importantly, though, the result is guaranteed to be inaccurate. Floating-point arithmetic is susceptible to rounding errors, particularly with repeated calculations and large values. Consequently, this example should primarily be considered for its use of barriers rather than its accuracy regarding the comparison of Fibonacci and exponential growth.

The key lines to note regarding these threads are the calls to
`pthread_barrier_wait()`

and the references to the fields `fib`

and `exp`

,
which store the results of each calculation. Specifically, note that there is no
race condition regarding these fields because the threads only *read* the values
after the barrier. That is, nothing changes after the barrier, so there is no
concern that the values checked are inaccurate. This guarantee arises from the
fact that both threads must reach the barrier (i.e., the
`pthread_barrier_wait()`

call) before either can proceed.

One question that cannot be answered with this code is which thread runs faster. With barriers, all we can say for sure is that both threads reach the barrier before either proceeds. We have no information about which one arrived first. From the perspective of the barrier, such a question is irrelevant. The only thing that matters is that both finish before they each check the results.

Code Listing 7.14 shows the initialization required. The main
thread initializes the barrier so that 2 threads are required, then creates the
threads. Notice that both threads receive pointers to the same `struct`

instance, so they share the same `barrier`

, `fib`

, and `exp`

fields. The
main thread joins both of the helper threads, but the join must occur after the
threads have called `pthread_exit()`

. Since both threads call
`pthread_barrier_wait()`

before that, the main thread is guaranteed to clean
up the barrier only *after* it is no longer needed.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | ```
/* Code Listing 7.14:
Initializing the barrier and data structures for Code Listing 7.13
*/
/* Initialize the barrier so that both threads must wait */
pthread_barrier_t barrier;
pthread_barrier_init (&barrier, NULL, 2);
/* Set up parameters to get the 30th Fibonacci and 1.6^30 */
struct args args;
args.barrier = &barrier;
args.digit = 30;
args.base = 1.6;
args.fib = 0;
args.exp = 0;
/* Create and join the threads, then destroy the barrier */
assert (pthread_create (&threads[0], NULL, fibonacci, &args) == 0);
assert (pthread_create (&threads[1], NULL, exponential, &args) == 0);
pthread_join (threads[0], NULL);
pthread_join (threads[1], NULL);
pthread_barrier_destroy (&barrier);
``` |

[1] | To be clear, the `exponential()` function is truly the exponential
$c^n$ rather than the polynomial $n^c$. In both `exponential()` and
`fibonacci(),` we are using $n = 100$. |

[2] | The selection of 1.6 as the value was not an arbitrary choice. There is a closed-form solution known as Binet’s formula, and calculations based on this formula can be done to show that each Fibonacci number is approximately 1.6 times the value of the previous number. |