«  10.7. Strings   ::   Contents   ::   10.9. Files  »

10.8. Function Pointers

As we have seen, pointers have many uses in the C language. They can be used to provide indirect references to primitive types, form the basis of dynamically sized arrays, create instances of structs on demand, manipulate string data, and so on. One additional use of pointers is to create a reference to a function. That is, a function pointer is a variable that stores the address of a function. Readers who have previous experience with assembly language may recall that the name of a function is the same as a global label for an instruction. In other words, the name of a function is an alias for the address of the first instruction in the function’s body. A function pointer, then, stores this address.

To get started with function pointers, Code Listing A.49 defines two simple functions that we will use later. A key aspect of these functions is that they have the same signature. That is, the two functions take the same number of arguments, the types of the arguments are identical (the names of the parameters do not matter), and the return type is the same. In this example, both functions take two int arguments and return an int. Lines 6 and 7 are explicit function prototypes that declare the functions’ interfaces. Function prototypes are only required when a function is used before it is defined. For instance, if line 12 were changed to make a call to sub(), the compiler would use the prototype on line 7 to confirm that the call is correctly formatted; calling sub (1, 2) would be acceptable, while calling sub ("marine") would not, due to the required parameter types. Function prototypes can be omitted (and usually are) if the function definition occurs before the first call to it.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
/* Code Listing A.49:
   Two simple arithmetic functions with the same function signature
 */

/* Function prototypes as they might appear in a header file */
int add (int, int);
int sub (int, int);

int
add (int addend_1, int addend_2)
{
  return addend_1 + addend_2;
}

int
sub (int minuend, int subtrahend)
{
  return minuend - subtrahend;
}

Code Listing A.50 uses the function declarations from A.49. Line 5 starts by declaring a function pointer variable called fun. The full explanation of function pointer declarations is provided below. For now, it is sufficient to note that fun is a pointer, so it can store an address. The names of the functions from Code Listing A.49, add and sub, are simply readable aliases for the addresses of the functions’ code. In other words, we can think of add and sub as address constants; setting fun = add is similar to setting fun = NULL to create a null pointer. Lines 5 and 8, then assign the addresses of add and sub to the pointer fun. The effect of this assignment is to make fun an alias for the function at that particular point. Lines 6 and 9 demonstrate that using a function pointer to make a function call works exactly like a standard function call. That is, the call to fun (5, 3) is identical to calling add (5, 3) directly.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
/* Code Listing A.50:
   Declaring a function pointer and assigning its value
 */

int (*fun) (int, int) = add;
int sum = fun (5, 3);

fun = sub;
int diff = fun (5, 3);

printf ("Sum is %d, difference is %d\n", sum, diff);

Function pointer declarations follow the structure defined in Figure 10.8.1. To understand the declaration, the components need to be read from the inside to the outside. That is, the middle portion, (*fun), is read first. Specifically, the * declares a pointer, just as any other pointer declaration. Wrapping the * and the variable name in parentheses indicates that it is a function pointer, in particular. Traversing outward, the rest of the line completes the function pointer’s type. The variable declaration starts with a single return type and ends with the list of parameter types in parentheses. In sum, a function pointer declaration looks exactly like a standard function prototype, with the exception of the (*name) structure where the prototype would just have the function name.

Components of a function pointer declaration

Figure 10.8.1: Components of a function pointer declaration

Decorative bug warning

Bug Warning


When working with function pointers, it is critical to understand the placement and exact role of parentheses. The (*name) structure is required to declare a variable instance. Similarly, the parameter list must also be in parentheses, just like standard function prototypes and definitions. One common mistake is to use parentheses on the right side of the assignment statement as shown below:

int (*ptr) (void) = trouble ();

With one exception, this line is incorrectly written. The declaration on the left indicates that there is a new function pointer variable, ptr. Any function that ptr can be set to must adhere to the interface that it accepts no arguments (specified by (void)) and returns an int. Given this declaration, the compiler will check to see if the value being assigned to ptr has the same type. That is, does the right side of the assignment evaluate to a pointer to a function, specifically one that takes no arguments and returns an int? Consider one possible declaration for the function called trouble():

1
2
3
4
5
6
7
8
int trouble (void); // trouble's function prototype

int 
trouble (void)
{
  /* Do something */
  return 5;
}

At first glance, everything looks okay. The prototype on line 1 and the definition of trouble() both indicate that the function takes no arguments and returns an int. However, that is exactly why the assignment above is incorrect. On the original line, we are not assigning the address of the function to ptr. Rather, the right hand of the assignment calls trouble(), which runs and returns the int value 5. As such, the line above is equivalent to setting ptr = 5; this is assigning an int value to a pointer variable, not assigning an address!

The exception to this description is if trouble() was specifically a function that returned another function. That is, trouble() is a function that takes no arguments and returns a pointer to a function, such that the returned function takes no arguments and returns an int. The syntax for this version of trouble() is, frankly, atrocious and unreadable. We provide it here for the curious, though we do not recommend ever writing code like this. If it is necessary to return a function pointer, typedefs can make this code significantly more readable.

1
2
3
4
5
6
7
8
/* The full prototype to be precise */
int (*trouble (void)) (void);

int
(*trouble (void)) (void)
{
  /* return a function pointer */
}

Perhaps the most confusing aspect of this declaration is which parameter list goes with which function. Counterintuitively, the inner portion defines the input parameters for trouble(). That is, the portion that reads (*trouble (void)) indicates that trouble() takes no parameters; the portions outside of these parentheses indicate the return type and parameter list for the function that trouble() returns.

10.8.1. Passing Function Pointers as Arguments

Given the complexity of declaring and working with function pointers, it is natural to ask why they are beneficial or even necessary. The two most common uses are to pass a function as an argument to another function and to build a lookup table of functions. We will start with the first example. This application might seem familiar to readers who are familiar with functional programming or the concept of lambdas that has been added to languages like Java. Code Listing A.51 provides an example of this kind of usage, using the (much maligned) Bubble Sort algorithm.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
/* Code Listing A.51:
   Bubble sort implementation that takes a custom comparison operation
 */

void
bubble_sort (void *array[], size_t length, int (*compare) (void *, void *))
{
  for (size_t i = 0; i < length - 1; i++)
    for (size_t j = 0; j < length - i - 1; j++)
      if (compare (array[j], array[j+1]) > 0)
        {
          void *temp = array[j];
          array[j] = array[j+1];
          array[j+1] = temp;
        }
}

The bubble_sort() function takes three parameters: an array of pointers, an array length, and a comparison function. As the array parameter is an array of void* elements, this implementation can sort any type of objects. That is, the array could contain pointers to strings, pointers to FILE objects (e.g., to sort based on file size), or pointers to any custom data types. The compare function pointer must point to a function that takes two void* inputs and returns an int. The type cannot indicate this, but any function pointer passed as the compare parameter needs to adhere to a convention based on strcmp(); return 1 if the first item is “greater” than the second, -1 if the second item is “greater,” and 0 if they match. Code Listing A.52 provides an example of such a comparison that performs a case-insensitive string comparison (as opposed to the case-sensitive nature of strcmp()).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
/* Code Listing A.52:
   A case-insensitive string comparison
 */

int
strcmp_nocase (char *first, char *second)
{
  /* Assume non-empty strings for simplicity */
  char *first_walk = first;
  char *second_walk = second;

  /* Walk until one of the strings reaches a null byte */
  while (*first_walk != '\0' && *second_walk != '\0')
    {
      /* Make a lower-case copy of the characters and compare */
      int fc = tolower (*first_walk);
      int sc = tolower (*second_walk);
      if (fc > sc)
        return 1;
      if (fc < sc)
        return -1;

      /* Continue to the next character if the current matches */
      first_walk++;
      second_walk++;
    }

  if (*first_walk == *second_walk) // Strings are identical
    return 0;

  if (*first_walk == '\0') // First was shorter than second
    return -1;

  return 1; // First was longer than second
}

Code Listing A.53 applies these two functions to demonstrate the power of function pointers for flexible code execution. Line 5 starts with an array of strings with a mixture of cases. Line 7 passes this array to bubble_sort() along with a pointer to the strcmp_nocase() function. To be precise, the type of strcmp_nocase() does not match the type required for the compare parameter to bubble_sort(); strcmp_nocase() takes two char* arguments, whereas the type declaration of compare indicates that the function needs to take two void* arguments. Hence, line 7 explicitly casts strcmp_nocase() as needed. Similar casting is needed at line 14 for strcmp().

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
/* Code Listing A.53:
   Passing two functions as parameters to the generic Bubble Sort
 */

char *strings[] = { "Hello", "goodbye", "Hi", "HIT", "ha" };

bubble_sort ((void **)strings, 5, (int (*)(void *, void *))strcmp_nocase);

printf ("SORTED (case-insensitive):");
for (size_t i = 0; i < 5; i++)
  printf (" %s", strings[i]);
printf ("\n");

bubble_sort ((void **)strings, 5, (int (*)(void *, void *))strcmp);

printf ("SORTED (case-sensitive):");
for (size_t i = 0; i < 5; i++)
  printf (" %s", strings[i]);
printf ("\n");

Although the preceding code works, the number of parentheses and * operators can make it hard to read. One common way to improve the readability is to use a typedef to simplify the type declarations and function calls. Code Listing A.54 demonstrates this practice by defining a new type, comp_t, as a function pointer that takes two void* parameters and returns an int. Readers who are comfortable with typedef may fine the version on line 5 to be odd; normally, the new type name appears at the end of the line, just before the ; character. With function pointers, however, the type name is placed where the variable name would go in a function pointer declaration. Despite this oddity, line 8 is now easier to read than in Code Listing A.51; the compare parameter now matches the comp_t type, and the reader does not need to parse the complexities of function pointer declarations. Code Listing A.55 shows how the typedef simplifies the calls to bubble_sort() with cleaner casting.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
/* Code Listing A.54:
   Simplifying the bubble_sort() interface with a typedef
 */

typedef int (*comp_t) (void *, void *);

void
bubble_sort (void *array[], size_t length, comp_t compare)
{
  /* ... omitting the rest ... */
1
2
3
4
5
6
7
8
9
/* Code Listing A.55:
   Using the simplified bubble_sort() interface
 */

/* Replacement for A.53, line 7 */
bubble_sort ((void **)strings, 5, (comp_t)strcmp_nocase);

/* Replacement for A.53, line 14 */
bubble_sort ((void **)strings, 5, (comp_t)strcmp);

10.8.2. Function Pointer Lookup Tables

Another common use for function pointers is to create a lookup table of functions. That is, rather than building complex logic structures in code to determine which function needs to be called under certain conditions, this information is represented in a table format, such as a one- or two-dimensional array. To illustrate this concept, assume that Code Listing A.49 has been extended to support the five basic arithmetic operations (addition, subtraction, multiplication, division, modulus). Using these functions, Code Listing A.56 builds a single interface, calculate(), that can be used for any of them.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
/* Code Listing A.56:
   Using the simplified bubble_sort() interface
 */

typedef enum ops { ADD, SUB, MUL, DIV, MOD, NOP } op_t;
typedef int (*arith_t) (int, int);

int
calculate (op_t operation, int x, int y)
{
  static const arith_t ops[] = { add, sub, mul, div, mod };
  
  if (operation >= NOP || operation < 0)
    {
      errno = EINVAL;
      return 0;
    }

  return ops[operation] (x, y);
}

Lines 5 and 6 start by defining custom types for the valid operations (op_t) and the binary arithmetic funtions (arith_t). Line 11 defines the lookup table; it is a simple array of the five function pointers. (Again, note that these entries are just the names of the functions and do not have parentheses.) This line reiterates the value of using a typedef for function pointers, as the syntax to declare an array of function pointers, along with the static and const keywords, is unnecessarily cumbersome. Defining the enum on line 5 and performing the check on line 13 ensures that the operation is one of the five supported. Line 19, then, actually performs the operation. By indexing into the ops array, ops[operation] evaluates to a pointer to the specified function. The (x, y) notation after that indicates that this function is to be executed with x and y as the arguments. This calculate() function takes no special steps based on which function is being called (although it would be appropriate to prevent the use of 0 for y with div and mod); rather, the logic of determining the function lies entirely in the lookup table structure. Code Listing A.57 uses this simplified interface to perform each of the specified operations.

1
2
3
4
5
6
7
8
9
/* Code Listing A.57:
   Using the single calculate() interface
 */

printf ("SUM: %d\n", calculate (ADD, 5, 3));
printf ("DIF: %d\n", calculate (SUB, 5, 3));
printf ("PRO: %d\n", calculate (MUL, 5, 3));
printf ("QUO: %d\n", calculate (DIV, 5, 3));
printf ("REM: %d\n", calculate (MOD, 5, 3));
Decorative note icon

Note


Based just on the toy example of five arithmetic operations, it is fair to object to the need for tables of function pointers. Clearly, Code Listing A.57 could simply call add (5, 3) instead of calling calculate (ADD, 5, 3) to get the same result. From this example, the benefits of function pointers might not be evident. It is important to note, though, that function pointers are a difficult topic; this example was designed to be as simple as possible to explain the mechanics.

These lookup tables are widely used in a variety of fields, including compiler design. As the parser (one component of a compiler) reads the characters of a file a byte at a time, it needs to keep track of what has been read and what was just read. For instance, a C compiler would need to distinguish reading a space after "int" and reading a space after "int_value", as the former is a language keyword and the latter could be a variable name. Trying to write standard logic code for this processing is extremely difficult due to the sheer number of possible branches. Instead, a two-dimensional lookup table (a very large one!) provides a more manageable approach. The details vary, but a simple way to envision it is that each row indicates a state that represents what has been read (i.e., "i", "in", and "int” would be different states); the columns would then be used to identify a different function pointer for each possible input character.

«  10.7. Strings   ::   Contents   ::   10.9. Files  »

Contact Us License