The POSIX thread library is a powerful and robust mechanism for concurrent systems programming. However, the library places a significant burden on the programmer to ensure that the implementation avoids race conditions and other bugs. In Synchronization Problems, we will examine common patterns that emerge in these types of programs and how to avoid subtle errors.
Since threads were first introduced, language designers have explored a number of techniques that reduce the complexity and responsibility of managing threads. This section will examine three different approaches for making multithreading easier. The first approach is a general style called implicit threading which aims to hide the management of threads as much as possible. The second approach is to treat threads as objects in languages like Java and Python. The third (and most modern) approach is to design the language around the concept of concurrency as a fundamental feature.
Implicit threading is the use of libraries or other language support to hide the management of threads. In the context of C, the most common implicit threading library is OpenMP. OpenMP uses #pragma compiler directives to mark code that should run in parallel; the compiler then detects these directives and inserts the additional library code at compile time. As an example, consider the prime number calculator from the Extended Examples. Code Listing 6.17 shows the OpenMP equivalent.
/* Code Listing 6.17:
   OpenMP can make some multithreading trivial to implement
 */

#include <stdio.h>
#include <stdbool.h>
#include <omp.h>

int
main (int argc, char *argv[])
{
  /* Set up the overall algorithm parameters */
  unsigned long end = 100000000L;
  unsigned long iter = 0;
  unsigned long count = 0;

  /* OpenMP parallel for-loop with reduction on count. Each thread
     will have its own count, but they'll be combined when all
     threads are done. */
  #pragma omp parallel for default(shared) private(iter) \
      reduction(+:count)
  for (unsigned long value = 2; value < end; value++)
    {
      bool is_prime = true;
      for (iter = 2; iter * iter <= value && is_prime; iter++)
        if (value % iter == 0)
          is_prime = false;
      if (is_prime)
        count++;
    }

  printf ("Total number of primes less than %lu: %lu\n", end, count);
  return 0;
}
With implicit threading, the focus of the programmer is on writing the algorithm rather than on the multithreading. The OpenMP library itself takes care of managing the threads. Specifically, the #pragma line indicates that OpenMP (omp) should parallelize a for-loop (parallel for) with some constraints on the variables. The OpenMP implementation on that system will then inject code to perform the thread creation and join.
OpenMP in C is typically built on top of the pthread library. As such, any code that can be written using OpenMP can be converted into a more verbose pthread equivalent. The disadvantage of OpenMP is that it only works well for certain types of tasks, primarily data-parallel work such as loops. There are many types of programs, such as the keyboard listener example, that can be implemented in pthreads but not in OpenMP.
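To illustrate what the OpenMP version is hiding, consider a rough pthread sketch of the same parallel prime count. This sketch is only an approximation of what an OpenMP implementation might generate; the choice of four threads, the count_args structure, and the count_primes() helper are illustrative assumptions rather than part of the original example.

/* Hedged sketch: an explicit pthread version of the parallel prime count.
   NUM_THREADS, count_args, and the range splitting are illustrative choices,
   standing in for what the OpenMP runtime does automatically. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_THREADS 4

struct count_args {
  unsigned long start;   /* first candidate this thread checks */
  unsigned long end;     /* one past the last candidate it checks */
  unsigned long count;   /* this thread's private tally */
};

/* Each thread counts primes in its own sub-range using a private count,
   mirroring OpenMP's reduction(+:count) */
static void *
count_primes (void *arg)
{
  struct count_args *args = (struct count_args *) arg;
  for (unsigned long value = args->start; value < args->end; value++)
    {
      bool is_prime = (value >= 2);
      for (unsigned long iter = 2; iter * iter <= value && is_prime; iter++)
        if (value % iter == 0)
          is_prime = false;
      if (is_prime)
        args->count++;
    }
  return NULL;
}

int
main (void)
{
  unsigned long end = 100000000L;
  unsigned long total = 0;
  pthread_t threads[NUM_THREADS];
  struct count_args args[NUM_THREADS];

  /* Split the range [2, end) roughly evenly across the threads */
  unsigned long chunk = end / NUM_THREADS;
  for (int i = 0; i < NUM_THREADS; i++)
    {
      args[i].start = (i == 0) ? 2 : i * chunk;
      args[i].end = (i == NUM_THREADS - 1) ? end : (i + 1) * chunk;
      args[i].count = 0;
      pthread_create (&threads[i], NULL, count_primes, &args[i]);
    }

  /* Join the threads and combine their private counts (the "reduction") */
  for (int i = 0; i < NUM_THREADS; i++)
    {
      pthread_join (threads[i], NULL);
      total += args[i].count;
    }

  printf ("Total number of primes less than %lu: %lu\n", end, total);
  return 0;
}

Even in this simplified form, the programmer must hand-write the range splitting, the thread creation and joining, and the final combination of the per-thread counts; that bookkeeping is exactly what the single #pragma line in Code Listing 6.17 replaces.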
Traditional object-oriented languages provide explicit multithreading support by treating threads as objects. In these languages, classes are written either to extend a thread class or to implement a corresponding interface. This style resembles the pthread approach, as the code is written with explicit thread management. However, the encapsulation of data within the classes and additional synchronization features simplify the task.
Java provides both a Thread class and a Runnable interface that can be used, as shown in Code Listing 6.18. Both require implementing a public void run() method that defines the entry point of the thread. Once an instance of the object is allocated, the thread is started by invoking start() on it; for a Runnable, the instance is first wrapped in a Thread object, and start() is invoked on that wrapper. As with pthreads, starting the thread is asynchronous, so the timing of its execution is nondeterministic.
/* Code Listing 6.18:
   Java offers both a class and an interface to support threads as objects
 */

class ThreadExtender extends Thread {
    public void run() {
        /* Implement code here */
    }
}

class RunImplement implements Runnable {
    public void run() {
        /* Implement code here */
    }
}

/* Enclosing class (name chosen here) so the example compiles as-is */
public class ThreadDemo {
    public static void main(String args[]) throws InterruptedException {
        /* Instantiate the ThreadExtender and start it */
        ThreadExtender te = new ThreadExtender();
        te.start();

        /* Instantiate the Runnable version and start it */
        RunImplement ri = new RunImplement();
        Thread t = new Thread(ri);
        t.start();

        /* Join the threads */
        te.join();
        t.join();
    }
}
Code Listing 6.19 demonstrates two mechanisms for multithreading in Python. One approach is similar to the pthread style, where a function name is passed to a library method, thread.start_new_thread(). This approach is very limited, as it lacks the ability to join or terminate the thread after it starts. A more flexible technique is to use the threading module to define a class that extends threading.Thread. Similar to the Java approach, the class must have a run() method that provides the thread's entry point. Once an object is instantiated from this class, it can be explicitly started and joined later.
# Code Listing 6.19:
# Python offers a pthread-like interface and an object-oriented module for threading

#!/usr/bin/python
import thread
import threading
import time

# Low-level procedural approach with the thread module
def proc_thread (name):
    # thread code here
    print "Ran thread " + name

# Use thread module to create and start the thread
try:
    thread.start_new_thread (proc_thread, ("First Thread", ))
except:
    print "Failed to start thread"

# OOP approach using the threading module
class ObjThread (threading.Thread):
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name

    def run(self):
        # Thread entry point here
        print "Running object thread " + self.name

# Create an instance of the object to start and join
threadObj = ObjThread("Second Thread")
threadObj.start()
threadObj.join()

# Delay to make sure both run
time.sleep(2)
Languages such as C, Java, and Python were all designed before multicore architectures rose to prominence in the early 2000s. As such, multithreading support in these languages was added as a supplement to the language, rather than a core feature. These languages were originally designed for a uniprocessing procedural or object-oriented paradigm. As a result, the memory models that underlie these languages are not adequate to prevent race conditions. The multithreading libraries had to provide additional features that allowed programmers to synchronize access to shared data. Or, put another way, programmers were forced to do extra work to make their programs work correctly.
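As a minimal sketch of that extra work, consider the standard pthread idiom for protecting a shared counter. The counter, the loop bound, and the use of four threads here are illustrative choices; the point is only that every access to shared data must be manually paired with a lock and unlock, and nothing in the language forces the programmer to do so.

/* Minimal sketch of the "extra work": the programmer must pair every
   access to the shared counter with an explicit lock/unlock.
   The counter, loop bound, and thread count are illustrative. */
#include <pthread.h>
#include <stdio.h>

static long shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *
increment (void *arg)
{
  for (int i = 0; i < 100000; i++)
    {
      pthread_mutex_lock (&counter_lock);   /* omit this and the ++ races */
      shared_counter++;
      pthread_mutex_unlock (&counter_lock);
    }
  return NULL;
}

int
main (void)
{
  pthread_t threads[4];
  for (int i = 0; i < 4; i++)
    pthread_create (&threads[i], NULL, increment, NULL);
  for (int i = 0; i < 4; i++)
    pthread_join (threads[i], NULL);

  /* Prints 400000 only because every increment was locked */
  printf ("Final count: %ld\n", shared_counter);
  return 0;
}

Omitting the lock (or pairing it incorrectly) still compiles and usually still appears to work, failing only intermittently at run time. The languages described next aim to rule out this class of error by design.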
Newer programming languages have avoided this problem by building assumptions of concurrent execution directly into the language design itself. For instance, Go combines a trivial implicit threading technique (goroutines) with channels, a well-defined form of message-passing communication. Rust adopts an explicit threading approach similar to pthreads. However, Rust has very strong memory protections that require no additional work by the programmer.
The Go language includes a trivial mechanism for implicit threading: place the keyword go before a function call. In Code Listing 6.20, the line go keyboard_listener(messages) launches a new thread that will execute the keyboard_listener function. The new thread is passed a connection to a message-passing channel. Then, the main thread calls success := <-messages, which performs a blocking read on the channel. Once the user has entered the correct guess of 7, the keyboard listener thread writes to the channel, allowing the main thread to progress.
Channels and goroutines are core parts of the Go language, which was designed under the assumption that most programs would be multithreaded. This design choice streamlines the development model, allowing the language itself to bear the responsibility for managing the threads and scheduling.
/* Code Listing 6.20:
   Go provides built-in concurrency techniques with goroutines and channels for communication
 */

package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
    "strings"
)

/* Main program entry point */
func main() {
    /* Create a channel for communication */
    messages := make(chan string)
    fmt.Print("Guess a number between 1 and 10: ")

    /* Start keyboard listener as a goroutine with the channel */
    go keyboard_listener(messages)

    /* Wait until there is data in the channel */
    success := <-messages
    if success == "true" {
        fmt.Println("You must have guessed 7.")
    }
}

/* Define the keyboard listener thread with channel back to main */
func keyboard_listener(messages chan string) {
    stdin := bufio.NewReader(os.Stdin)

    /* Loop forever, reading keyboard input */
    for {
        text, _ := stdin.ReadString('\n')

        /* Try to convert the input text to the int 7 */
        value, err := strconv.ParseInt(strings.Trim(text, "\n"), 10, 32)
        if err == nil {
            if value == 7 {
                /* Success. Send a message back through
                   the channel and exit */
                messages <- "true"
                return
            }
        }
        fmt.Print("Wrong. Try again. ")
    }
}
Rust is another language that has been created in recent years with concurrency as a central design feature. Code Listing 6.21 illustrates the use of thread::spawn() to create a new thread, which can later be joined by invoking join() on it. The argument to thread::spawn(), beginning at the move ||, is known as a closure, which can be thought of as an anonymous function. That is, the child thread here will decrement and print its copy of x.
/* Code Listing 6.21:
   Rust threads can be created with anonymous functions known as closures
 */

use std::thread;

fn main() {
    /* Initialize a mutable variable x to 10 */
    let mut x = 10;

    /* Spawn a new thread that receives its own copy of x */
    let child_thread = thread::spawn(move || {
        /* Decrement the child's copy of x, then print it */
        x -= 1;
        println!("x = {}", x)
    });

    /* Change x in the main thread and print it */
    x += 1;
    println!("x = {}", x);

    /* Join the child thread */
    child_thread.join().unwrap();
}
However, there is a subtle point in this code that is central to Rust's design. Within the new thread (executing the code in the closure), the x variable is distinct from the x in other parts of this code. Rust enforces a very strict memory model (known as ownership) which prevents multiple threads from accessing the same memory. In this example, the move keyword indicates that the spawned thread will receive a separate copy of x for its own use. Regardless of the scheduling of the two threads, the main and child threads cannot interfere with each other's modifications of x, because they are distinct copies. It is impossible for the two threads to share access to the same memory.
In this small example, the issue of ownership may not seem like a big deal. However, if you learn more about Rust and concurrency, you will quickly realize that it is. Ownership makes Rust unique and makes it a very powerful language for concurrent programming. The crux is that ownership eliminates data races: threads can never have unsynchronized access to the same memory location. Furthermore, Rust achieves this memory safety without imposing any run-time performance penalty, because ownership constraints are checked and enforced at compile time. This combination of memory safety and efficient performance gives Rust a significant advantage over other languages in regard to multithreading.