4.4. The Socket Interface

The socket interface in C provides a mechanism for setting up a communication channel to another host system. For both clients and servers, the initial function call is the same. Processes call socket() to request a new socket instance from the OS. As with other forms of IPC such as pipes, sockets are treated as files, so the process receives a file descriptor from the return value. If the socket creation failed, the OS returns a negative value.

C library functions – <sys/socket.h>

int socket (int domain, int type, int protocol);: Create a socket instance.

The domain field is used to declare the intended scope of routing needed; different values here indicate whether the socket will be used for IPv4, IPv6, or local communication. The type field determines whether the socket will read and write data as a byte stream, as discrete messages (datagrams), or as unprocessed (raw) data. The protocol field is typically set to 0, which selects the default protocol for the given domain and type; one exception occurs when the client is acting as a packet sniffer, an application for capturing and examining packets sent by other processes on a host or network. These applications use raw sockets, as described below. For all three fields, the socket header file defines constant values that are used. Table 4.2 identifies several common constants.

Field	Constant	Purpose
`domain`	`AF_INET`	Use IPv4 addresses
	`AF_INET6`	Use IPv6 addresses
	`AF_LOCAL`	Unix domain socket for IPC
	`AF_NETLINK`	Netlink socket for kernel messages
	`AF_PACKET`	Low-level packet interface (Linux), used for raw sockets
`type`	`SOCK_STREAM`	Byte-stream communication, used for TCP transport
	`SOCK_DGRAM`	Datagram (message-based) communication, used for UDP transport
	`SOCK_RAW`	Raw data that is not processed by transport layer
`protocol`	`IPPROTO_RAW`	Receive IP datagrams without transport-layer processing
`protocol`	`ETH_P_ALL`	Receive Ethernet frames without network-layer processing

Table 4.2: Common arguments to the socket() function

The domain and type field constants are defined in the sys/socket.h header file. The domain fields listed here have another form that replaces the AF with PF. For example, there are also PF_INET and PF_PACKET constants. The original use of this different notation was to distinguish an address family (AF) from a protocol family (PF). In practice, these values tend to be identical, with the AF form more commonly used. For the protocol field, the IPPROTO_RAW and similar IPPROTO_* constants are defined in netinet/in.h. The ETH_P_ALL and similar constants are stored in linux/if_ether.h on Linux systems; other systems do not have these specific values.

Only certain combinations of these arguments make sense. For instance, a raw socket created with the domain AF_PACKET would use the SOCK_RAW type; a raw socket would not be set up to use the SOCK_STREAM type, which is commonly used for the stream-oriented TCP transport-layer protocol. Similarly, an application socket created for IPv6 communication (AF_INET6) would not normally use the SOCK_RAW type. A raw socket delivers the IP datagram payload directly to the process without the transport-layer processing, so the process would receive the full payload, including the TCP or UDP header fields. Code Listing 4.1 illustrates several common combinations of socket arguments.

/* Code Listing 4.1:
   Common argument types for various sockets
 */

/* Create an IPv4 socket for TCP */
socketfd = socket (AF_INET, SOCK_STREAM, 0);

/* Create an IPv6 socket for TCP */
socketfd = socket (AF_INET6, SOCK_STREAM, 0);

/* Create an IPv4 socket for UDP */
socketfd = socket (AF_INET, SOCK_DGRAM, 0);

/* Create an IPv6 socket for UDP */
socketfd = socket (AF_INET6, SOCK_DGRAM, 0);

/* Create a raw socket for sniffing unprocessed Ethernet frames */
socketfd = socket (AF_PACKET, SOCK_RAW, htons (ETH_P_ALL));

/* htons is explained below */
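Code Listing 4.1 omits error handling for brevity. As a minimal sketch of checking the return value described at the start of this section, the helper below (open_socket() is a hypothetical name, not part of the socket interface) reports the failure and passes the negative value back to the caller.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: request a socket from the OS and report failure.
   Returns the new file descriptor, or a negative value on failure. */
int
open_socket (int domain, int type)
{
  /* Protocol 0 asks for the default protocol for this domain and type */
  int socketfd = socket (domain, type, 0);
  if (socketfd < 0)
    perror ("socket"); /* errno explains why the request was refused */
  return socketfd;
}
```

A caller might write int fd = open_socket (AF_INET, SOCK_STREAM); and, because the socket is a file descriptor, release it later with close (fd).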

Processes running on two separate hosts normally intend to exchange application-layer messages, relying on the lower layers just as a mechanism to deliver the data. (Refer back to Figure 4.2.1.) However, some processes are designed to gather information about the network itself. One example of such a process would be a packet sniffer that captures and inspects data sent through the network. Another example is the traceroute utility, which can gather information about the routers that are encountered on the way to a target host. Both of these applications can be used to monitor the health and performance of a network, and they also form the basis of security tools, such as network intrusion detection systems (network IDS).

Applications that gather information about the network rely on the use of raw sockets, which are sockets intended to break the layers of abstraction of the protocol stack. Using a raw socket, a process can examine a full IP datagram or TCP segment, including their respective headers. For instance, a raw socket with the protocol IPPROTO_RAW could be used to examine the TCP or UDP headers for a packet. This reveals information about the processes that sent the data or are intended to receive it; however, this packet has still been processed by the Internet layer, so the IP headers would have been removed. To examine the IP headers, the ETH_P_ALL protocol field is needed.

Normal applications should not use raw sockets, because most of the information can be accessed in other ways. That is, when a client connects to a server, the client would already learn that host’s IP address and the port number associated with the server. Furthermore, using a raw socket would require the application programmer to understand details of the transport and network protocols to know how much data at the beginning of the message to ignore. Unless the application is specifically designed to gather information about the network itself, raw sockets are not needed in typical use.

4.4.1. Networking Data Structures

Once the socket is created, the client and server processes use different functions to establish the network link between their sockets. These functions rely on a common struct sockaddr data structure. The basic form of this structure contains two fields:

/* defined in sys/socket.h */
struct sockaddr {  /* generic socket address structure */
  sa_family_t sa_family;
  char sa_data[14];
};

The first field, sa_family, is two bytes in size and identifies the domain of the socket that is being used. This field is set using AF_INET, AF_PACKET, and similar constants used to create the socket. The other field, sa_data, is an unstructured 14-byte sequence in the generic form. The interpretation of these bytes depends on the domain of the socket. For an IPv4 socket, the process would create a struct sockaddr_in [1] instead, which uses the following definition.

/* defined in netinet/in.h */
struct sockaddr_in {
  sa_family_t sin_family;
  in_port_t sin_port;
  struct in_addr sin_addr;
  char sin_zero[8];
};

struct in_addr {
  in_addr_t s_addr; /* in_addr_t is a typedef alias for uint32_t */
};

The sockaddr and sockaddr_in structures are identical in size, allowing for straightforward casting between the two. Both begin with a sa_family_t field to indicate the domain. As the type is the same, both structs use the same number of bytes for this information. The sockaddr_in breaks the rest of the bytes into three fields. The sin_port is a 16-bit field to designate the port number for the socket and the sin_addr contains the 32-bit IPv4 address. The struct in_addr contains a single field, s_addr, whose type (in_addr_t) is made an alias for uint32_t by a typedef elsewhere in the C library. The sin_zero field of struct sockaddr_in is used to pad the size of the struct to match the size of the original sockaddr. [2]

Example 4.4.1

To illustrate the internal structure of these address structs, consider the following byte sequence:

Struct	Field	Bytes
`struct sockaddr`	`sa_family`	`02 00`
	`sa_data`	`00 50 5d b8 d8 22 00 00 00 00 00 00 00 00`
`struct sockaddr_in`	`sin_family`	`02 00`
	`sin_port`	`00 50`
	`sin_addr`	`5d b8 d8 22`
	`sin_zero`	`00 00 00 00 00 00 00 00`

In both cases, the first two bytes denote the family, and both types of structs interpret these bytes the same way. The value stored here is AF_INET (the constant 2), which indicates an IPv4 address. Code that checks this value as either type of struct would get the same answer for the family. If the address is interpreted as struct sockaddr_in, the three remaining fields (sin_port, sin_addr, and sin_zero) occupy 14 bytes, the same amount of space as the sa_data field in the struct sockaddr interpretation. In this particular example, the sin_port field contains the value 80 (0x50), whereas the sin_addr field contains the address 93.184.216.34 (0x5db8d822). Astute readers may notice there is a discrepancy in how the sin_family and sin_port fields are interpreted. This discrepancy is caused by the concept of endianness, which we will explain below.
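The byte layout in this example can be reproduced in code. The following sketch assumes two hypothetical helpers, make_ipv4_address() and family_of(), and relies on htons() and htonl(), which are covered later in this section. It fills in a struct sockaddr_in and then reads the family back through the generic struct sockaddr view.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an IPv4 socket address from host-order values */
struct sockaddr_in
make_ipv4_address (uint32_t ip, uint16_t port)
{
  struct sockaddr_in address;
  memset (&address, 0, sizeof (address)); /* also clears sin_zero */
  address.sin_family = AF_INET;           /* used locally: no conversion */
  address.sin_port = htons (port);        /* stored in network byte order */
  address.sin_addr.s_addr = htonl (ip);   /* stored in network byte order */
  return address;
}

/* Hypothetical helper: read the family through the generic struct */
int
family_of (const struct sockaddr *generic)
{
  return generic->sa_family;
}
```

For the address in this example, make_ipv4_address (0x5db8d822, 80) stores the port bytes as 00 50 and the address bytes as 5d b8 d8 22, and passing the result (cast to struct sockaddr *) to family_of() returns AF_INET.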

IPv6 socket address structs are similar in some respects. The struct is renamed sockaddr_in6 and the fields are renamed to sin6_family, sin6_port, and sin6_addr. Two additional fields (sin6_flowinfo and sin6_scope_id) are also defined for behavior that exists in IPv6 but not in IPv4; these fields are used for specialized purposes that are beyond the scope of this book.

/* included by netinet/in.h */
struct sockaddr_in6 {
  sa_family_t sin6_family;
  in_port_t sin6_port;
  uint32_t sin6_flowinfo;
  struct in6_addr sin6_addr;  /* IPv6 addresses are 128-bit */
  uint32_t sin6_scope_id;
};

Despite their similar naming and ordering of fields, IPv6 socket address structs are considerably larger in size. To be precise, consider the type of sin_addr compared with sin6_addr. For IPv4, the struct in_addr type wraps a single uint32_t, which is an unsigned 32-bit (4-byte) integer. For IPv6, the struct in6_addr contains a union of three different types. For readers unfamiliar with C unions, the types are not distinct fields. Rather, the __u6_addr field of the struct contains 16 bytes, but it can be interpreted in multiple ways. As such, the sin6_addr field alone is the size of the entire sockaddr or sockaddr_in structs.

/* included by netinet/in.h */
struct in6_addr {
  union {
    uint8_t  __u6_addr8[16];  /* aliased as s6_addr */
    uint16_t __u6_addr16[8];  /* aliased as s6_addr16 */
    uint32_t __u6_addr32[4];  /* aliased as s6_addr32 */
  } __u6_addr;
};

The difference in the sin6_addr field size is not a problem in practice. As the first field (sin6_family) remains a constant size of 16 bits, this field can be used to determine which type of struct is being used. The address field can then be accessed by explicitly casting to struct sockaddr, struct sockaddr_in, or struct sockaddr_in6 as needed. In the code, the 16 bytes of the sin6_addr field can be viewed as an array of 16 uint8_t values, an array of eight uint16_t values, or an array of four uint32_t values; all three views refer to the same sequence of bytes, and they all require the same size. Given that the syntax of unions can be awkward, these fields are aliased as s6_addr, s6_addr16, and s6_addr32. Code Listing 4.2 shows how the alias makes the syntax cleaner to use.

/* Code Listing 4.2:
   Accessing part of a union without and with an alias for the field
 */

struct in6_addr addr;
addr.__u6_addr.__u6_addr8[0] = 5; /* set first byte to 5 */
addr.s6_addr[0] = 5; /* same thing, but using the alias */

Example 4.4.2

The following sequence of bytes illustrates an example of an IPv6 address struct. The first observation to make is that this struct is 28 bytes in length, rather than the 16 bytes for IPv4 or the generic struct sockaddr.

Struct	Field	Bytes
`struct sockaddr_in6`	`sin6_family`	`0a 00`
	`sin6_port`	`00 50`
	`sin6_flowinfo`	`00 00 00 00`
	`sin6_addr`	`26 06 28 00 02 20 00 01 02 48 18 93 25 c8 19 46`
	`sin6_scope_id`	`00 00 00 00`

As with sockaddr and sockaddr_in, the first two bytes can be read independently of the rest to determine the sin6_family field. The value 10 (0x0a) is the constant AF_INET6, which indicates code working with a pointer to this struct should cast it as an IPv6 address. This example uses the same port (80) as Example 4.4.1. The IP address shown in the sin6_addr field here would be 2606:2800:220:1:248:1893:25C8:1946. Note that both of these addresses correspond to the same (as of this writing) hostname: www.example.com. This server can be reached using either IPv4 or IPv6.
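Checking the family before casting can be sketched as follows. The helper name port_of() is an assumption made for illustration; it reads the family through the generic struct sockaddr and only then casts to the matching IPv4 or IPv6 struct. The ntohs() call converts the stored port to the host's byte order, as explained later in this section.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the port in host byte order, or -1 if the
   address family is neither IPv4 nor IPv6 */
int
port_of (const struct sockaddr *address)
{
  switch (address->sa_family)
    {
    case AF_INET: /* safe to treat the bytes as a struct sockaddr_in */
      return ntohs (((const struct sockaddr_in *) address)->sin_port);
    case AF_INET6: /* safe to treat the bytes as a struct sockaddr_in6 */
      return ntohs (((const struct sockaddr_in6 *) address)->sin6_port);
    default:
      return -1;
    }
}
```

Given a pointer to either of the structs from Example 4.4.1 or Example 4.4.2, this function would return 80.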

To illustrate the purpose of the struct in6_addr union, consider the following version of the IPv6 address from above. As before, we have written the address to wrap onto a second line for the purposes of the table structure.

`s6_addr32[0]`				`s6_addr32[1]`
`s6_addr16[0]`		`s6_addr16[1]`		`s6_addr16[2]`		`s6_addr16[3]`
`26`	`06`	`28`	`00`	`02`	`20`	`00`	`01`

`s6_addr32[2]`				`s6_addr32[3]`
`s6_addr16[4]`		`s6_addr16[5]`		`s6_addr16[6]`		`s6_addr16[7]`
`02`	`48`	`18`	`93`	`25`	`c8`	`19`	`46`

Each byte within the address can be accessed using the s6_addr array. If the address is stored in the variable addr as declared in Code Listing 4.2, addr.s6_addr[0] is 0x26, addr.s6_addr[1] is 0x06, and so on. However, the other types in this union allow the bytes to be grouped together as needed by the code. The groupings are shown by the labels for the s6_addr16 and s6_addr32 arrays above. That is, using the same variable declaration from Code Listing 4.2, addr.s6_addr16[0] covers the bytes 26 06 and addr.s6_addr16[1] covers the bytes 28 00, whereas addr.s6_addr32[0] covers the bytes 26 06 28 00. These bytes are stored in network (big endian) order, so the groups read as 0x2606, 0x2800, and 0x26062800 once they are converted to the host's byte order; on a little endian host, reading addr.s6_addr16[0] without conversion would yield 0x0626. The union type allows the programmer to use whichever interpretation is convenient.
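As a hedged illustration of these groupings, the sketch below extracts one 16-bit group from the byte array. It uses only the s6_addr byte view together with memcpy() and ntohs(), since the s6_addr16 and s6_addr32 aliases are not available on every system. The helper name ipv6_group() is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 16-bit group number index (0 through 7) of
   an IPv6 address as a host-order value */
uint16_t
ipv6_group (const struct in6_addr *addr, int index)
{
  uint16_t group;
  /* Copy two consecutive bytes; memcpy avoids alignment assumptions */
  memcpy (&group, &addr->s6_addr[2 * index], sizeof (group));
  /* The bytes are in network (big endian) order, so convert for the host */
  return ntohs (group);
}
```

For the address in Example 4.4.2, ipv6_group (&addr, 0) evaluates to 0x2606 and ipv6_group (&addr, 7) evaluates to 0x1946 regardless of the host's endianness.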

The internal representations of in_addr and in6_addr are different from the standard notation used for IP addresses. For instance, readers may be familiar with the dotted decimal notation of IPv4 addresses, as illustrated by the loopback address 127.0.0.1 that refers to the local host machine. This format is used for human readability, but the actual IP address is stored as a 32-bit value, with each byte corresponding to one of the dotted fields. In the case of the loopback, the IP address is 0x7f000001 (recall that 0x7f is the same as the decimal number 127). Similarly, IPv6 addresses are typically formatted as a colon-delimited list of groups of four hexadecimal digits, such as 1122:3344:5566:7788:99aa:bbcc:ddee:ff00. Leading zeros can be compressed, and consecutive sections of zeros can be replaced with a double colon. As such, the IPv6 loopback address can be written as 0:0:0:0:0:0:0:1, or simply ::1. However, the internal representation is the full 16-byte sequence of values.
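The conversion from the readable notation to the internal representation can be sketched with inet_pton(), a standard function declared in arpa/inet.h that this section does not otherwise cover. The wrapper names parse_ipv4() and parse_ipv6() are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Hypothetical helper: convert dotted decimal text to a host-order 32-bit
   value; returns 0 on success or -1 if the text is not a valid address */
int
parse_ipv4 (const char *text, uint32_t *result)
{
  struct in_addr addr;
  if (inet_pton (AF_INET, text, &addr) != 1)
    return -1;
  *result = ntohl (addr.s_addr); /* s_addr holds network byte order */
  return 0;
}

/* Hypothetical helper: convert colon-delimited text to the 16-byte form;
   returns 0 on success or -1 if the text is not a valid address */
int
parse_ipv6 (const char *text, struct in6_addr *result)
{
  return inet_pton (AF_INET6, text, result) == 1 ? 0 : -1;
}
```

With these helpers, parsing "127.0.0.1" produces 0x7f000001, and parsing "::1" or "0:0:0:0:0:0:0:1" produces the same 16 bytes, ending in 01.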

In both IPv4 and IPv6 the sin_port and sin6_port fields require special handling, due to the issue of endianness. Recall that multi-byte numbers, such as a 16-bit unsigned integer, can be stored according to either big endian or little endian format depending on the CPU architecture. In a big endian architecture, the most significant byte (i.e., the big end of the number) is placed at the lowest memory address; this relationship can be captured with the mnemonic, “big end at the bottom.” In contrast, a little endian architecture would place the least significant byte at the lowest address; here, the mnemonic is “little end at the lowest.” As an illustration, consider Code Listing 4.3. By casting the address of the uint16_t variable—which would actually be the address of its byte at the lowest numerical address—as a uint8_t pointer, the two bytes can be accessed individually.

/* Code Listing 4.3:
   Casting and pointer arithmetic can illustrate endianness
 */

struct sockaddr_in address;
address.sin_port = htons (4096); /* see below */
/* Cast the address to access each byte individually;
   the PRIx8 format macro is defined in inttypes.h */
uint8_t *as_array = (uint8_t *) &address.sin_port;
printf ("%p stores %02" PRIx8 "\n", (void *) &as_array[0], as_array[0]);
printf ("%p stores %02" PRIx8 "\n", (void *) &as_array[1], as_array[1]);

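Because of the htons() call, Code Listing 4.3 prints 10 followed by 00 on any host. To see the effect of endianness, suppose instead that the value 4096 (0x1000) were assigned without htons(). If we assume the address.sin_port field resides at address 0xbfff1234 (thus also occupying 0xbfff1235), the variable layout in memory would vary based on the CPU architecture: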

Big Endian
Address	Value
`bfff1235`	`00`
`bfff1234`	`10`

Little Endian
Address	Value
`bfff1235`	`10`
`bfff1234`	`00`

Table 4.3: Placement of bytes in memory depends on CPU endianness
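The same casting idea can be used to check which row of Table 4.3 applies to the current host. This is a minimal sketch with hypothetical helper names; it copies the byte at the lowest address with memcpy() rather than through a cast pointer.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little endian host, 0 on big endian */
int
host_is_little_endian (void)
{
  uint16_t value = 0x1000; /* 4096: high byte 0x10, low byte 0x00 */
  uint8_t lowest;
  memcpy (&lowest, &value, 1); /* byte stored at the lowest address */
  return lowest == 0x00;
}

/* Hypothetical helper: returns the byte that htons() places at the lowest
   address, which is the most significant byte on any host */
uint8_t
first_network_byte (uint16_t value)
{
  uint16_t network = htons (value);
  uint8_t lowest;
  memcpy (&lowest, &network, 1);
  return lowest;
}
```

Whatever host_is_little_endian() reports, first_network_byte (4096) is 0x10, because htons() always produces the big endian layout.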

While most modern CPU architectures use a little endian format, network protocols use big endian. Furthermore, since the sender and receiver hosts may have different CPU architectures, they may also differ in their endianness. To overcome this problem, multi-byte socket address fields that will be sent across the network require attention to endianness. The simplest way to achieve this is to use the C functions htons(), htonl(), ntohs(), and ntohl(). When the process is setting up the socket or sending data, use the hton (host to network) versions; as an example, refer back to the last line of Code Listing 4.1. At the other end, when a process reads data from the network, it will use the ntoh (network to host) version. Assuming both hosts follow this convention, neither host will require any advance knowledge of the other host’s CPU architecture endianness.

C library functions – <arpa/inet.h>

uint32_t htonl (uint32_t hostlong);: Convert a 32-bit unsigned integer from host endian format to network endian format.
uint16_t htons (uint16_t hostshort);: Convert a 16-bit unsigned integer from host endian format to network endian format.
uint32_t ntohl (uint32_t netlong);: Convert a 32-bit unsigned integer from network endian format to host endian format.
uint16_t ntohs (uint16_t netshort);: Convert a 16-bit unsigned integer from network endian format to host endian format.
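As a small sketch of the sending and receiving sides, assume an application that places a 32-bit value (such as a message length) into a byte buffer before sending it. The helper names put_u32() and get_u32() are hypothetical; the sender converts with htonl() and the receiver reverses the conversion with ntohl().

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 32-bit value into a message buffer in
   network byte order (sender side) */
void
put_u32 (unsigned char *buffer, uint32_t value)
{
  uint32_t network = htonl (value);
  memcpy (buffer, &network, sizeof (network));
}

/* Hypothetical helper: read a 32-bit value that arrived in network byte
   order (receiver side) */
uint32_t
get_u32 (const unsigned char *buffer)
{
  uint32_t network;
  memcpy (&network, buffer, sizeof (network));
  return ntohl (network);
}
```

Calling put_u32 (buffer, 4096) fills the buffer with the bytes 00 00 10 00 on any host, and get_u32 (buffer) recovers 4096 on any host.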

Bug Warning

Within the sockaddr and sockaddr_in structures, only fields that are intended to be sent across the network need to be formatted with the host to network functions. Other fields must not be formatted in this way, as they are only used by the local host machine; flipping the byte order of these values would produce incorrect results. In practice, only the port field needs to be formatted with htons(); IP addresses are generally not hard-coded and helper functions format these values transparently.

4.4.2. Client Socket Interface

Once a client process has created a socket, the next step is to build the socket address structure and to establish a connection with the socket at the server host. The client is primarily concerned with specifying the IP address of the server and the associated port number. In the case of a connection-less protocol like UDP, this step does not actually involve contacting the server; rather, this step just involves configuring the socket’s peer IP address.

While there are standard port numbers for many applications, IP addresses should not be hard-coded, as they can change. Instead, getaddrinfo() provides an interface to look up an IP address by hostname, the standard text format used in URIs (such as "www.example.com"). This string is passed as the first argument, nodename. The servname parameter indicates a desired service, such as "http". The hints parameter can be used to limit the list of results, such as restricting the domain to AF_INET (IPv4) or AF_INET6 (IPv6), or limiting the type to SOCK_STREAM or SOCK_DGRAM. The final parameter, res, is a call-by-reference parameter that will be set to point to a linked list of address structures.

C library functions – <netdb.h>

int getaddrinfo (const char *nodename, const char *servname, const struct addrinfo *hints, struct addrinfo **res);: Translate a human-readable hostname into an IP address, typically with the help of DNS.
void freeaddrinfo (struct addrinfo *ai);: Free all address information structures in the linked list beginning at ai.

Both the hints and res parameters use the struct addrinfo structure, defined in netdb.h. The caller initializes all fields of the hints argument to zero, then sets the int fields as desired. For example, setting hints.ai_family to AF_INET would get results only for IPv4; to get all families, the value can be left as zero or explicitly set as AF_UNSPEC. Similarly, setting hints.ai_socktype to SOCK_STREAM would yield only byte stream sockets (e.g., those used in TCP). In the results list, the ai_addr field would point to a struct sockaddr as defined previously. Note that getaddrinfo() puts all values into the struct in the correct endianness needed for later functions, so no additional conversion is needed. Since getaddrinfo() dynamically allocates the results list, the list returned through res should be passed to freeaddrinfo() to free the memory.

/* defined in netdb.h */
struct addrinfo {
  int ai_flags;
  int ai_family;
  int ai_socktype;
  int ai_protocol;
  socklen_t ai_addrlen;
  struct sockaddr *ai_addr;  /* field order shown is the Linux layout; */
  char *ai_canonname;        /* other systems may order these differently */
  struct addrinfo *ai_next;
};
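The lookup steps described above can be sketched as a single function. The name count_addresses() is hypothetical; the function zeroes the hints, checks the return value of getaddrinfo() (which reports errors through a nonzero status that gai_strerror() can translate), walks the linked list through ai_next, and frees the list.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: count the addresses found for a host and service,
   or return -1 if the lookup fails */
int
count_addresses (const char *nodename, const char *servname, int family)
{
  struct addrinfo hints, *list = NULL, *node = NULL;
  memset (&hints, 0, sizeof (hints)); /* unused fields must be zero */
  hints.ai_family = family;           /* AF_UNSPEC accepts any family */
  hints.ai_socktype = SOCK_STREAM;    /* byte streams only */

  int status = getaddrinfo (nodename, servname, &hints, &list);
  if (status != 0)
    {
      /* getaddrinfo() reports errors through its return value */
      fprintf (stderr, "getaddrinfo: %s\n", gai_strerror (status));
      return -1;
    }

  int count = 0;
  for (node = list; node != NULL; node = node->ai_next)
    count++;
  freeaddrinfo (list); /* the list was allocated by getaddrinfo() */
  return count;
}
```

For example, count_addresses ("www.example.com", "http", AF_UNSPEC) would count both the IPv4 and IPv6 results for that host, assuming the lookup succeeds.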

When building applications to connect to a server, the ai_addr field can be passed without casting. However, it is often beneficial to print the address in a readable format, such as when writing to a log file. The inet_ntoa() and inet_ntop() functions perform this formatting. Of these two, the inet_ntoa() function is older and only supports IPv4. This function takes a struct in_addr parameter, which would be the sin_addr field of a struct sockaddr_in. The return value is a pointer to a statically allocated location containing the string. As such, the pointer does not need to be (and must not be) freed later; a later call to inet_ntoa() overwrites the same location. On the other hand, inet_ntop() works with both IPv4 and IPv6, relying on the af parameter to distinguish between the two. The src argument points to the address to translate (the struct in_addr or struct in6_addr inside the socket address, rather than the sockaddr itself); note that this parameter uses a void * type, and the function will interpret it based on the af argument. The dst argument is a pointer to a buffer to write the string into, with size indicating the length of the buffer. If the address can be translated properly, the function returns a pointer to the string, which should match the address of the buffer.

C library functions – <arpa/inet.h>

char *inet_ntoa (struct in_addr in);: Convert an IPv4 address into a string using dotted decimal notation.
const char *inet_ntop (int af, const void *src, char *dst, socklen_t size);: Convert an IP address (either IPv4 or IPv6) into a string format.

Code Listing 4.4 illustrates how these functions can be used along with getaddrinfo() to translate a hostname (assumed to be declared with a value such as "www.example.com") into an IPv6 readable address format. The getaddrinfo() internal implementation typically relies on the Domain Name System (DNS) protocol to look up the IP address. In this initial version, each of the results in the server list is confirmed to be an IPv6 address. The ai_addr field is then cast to the appropriate socket address struct and its sin6_addr field is passed to the inet_ntop() function for formatting. This implementation could be modified to print IPv4 addresses correctly by changing only the constants and field names as described previously; the buffer size would need to use the constant INET_ADDRSTRLEN, as well.

/* Code Listing 4.4:
   Utility program that prints IPv6 addresses as a readable string
 */

struct addrinfo hints, *server_list = NULL, *server = NULL;
memset (&hints, 0, sizeof (hints));
hints.ai_family = AF_INET6;       /* change to AF_INET for IPv4 */
hints.ai_socktype = SOCK_STREAM;  /* limit to byte-streams */
hints.ai_protocol = IPPROTO_TCP;  /* create as a TCP socket */

/* Get a list of addresses at hostname that serve HTTP */
if (getaddrinfo (hostname, "http", &hints, &server_list) != 0)
  exit (1); /* lookup failed, so there is no list to traverse */

/* Traverse through the linked list of results */
for (server = server_list; server != NULL; server = server->ai_next)
  {
    if (server->ai_family == AF_INET6)
      {
        /* Cast ai_addr to an IPv6 socket address */
        struct sockaddr_in6 *addr = (struct sockaddr_in6 *)server->ai_addr;

        /* Allocate a buffer to store the IPv6 string */
        char in6addr[INET6_ADDRSTRLEN];
        /* Call inet_ntop() outside of assert(), which is compiled
           out when NDEBUG is defined */
        if (inet_ntop (AF_INET6, &addr->sin6_addr, in6addr, sizeof (in6addr))
            != NULL)
          printf ("IPv6 address: %s\n\n", in6addr);
      }
  }
freeaddrinfo (server_list); /* Free allocated linked list data */

Code Listing 4.5 shows how the if-block from Code Listing 4.4 could use inet_ntoa() instead of inet_ntop(). For IPv4, the two functions produce identical output, so either one can be used.

/* Code Listing 4.5:
   Additional code for printing out an IPv4 in dotted decimal format
 */

if (server->ai_family == AF_INET)
  {
    /* Cast ai_addr to an IPv4 socket address */
    struct sockaddr_in *addr = (struct sockaddr_in *)server->ai_addr;
    printf ("IPv4 address: %s\n", inet_ntoa (addr->sin_addr));
  }
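The two if-blocks can also be combined into one routine that handles either family. The sketch below uses a hypothetical helper, format_address(), which selects the address field that matches sa_family and passes a pointer to that field (not to the whole socket address) to inet_ntop().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: write the readable form of an IPv4 or IPv6 socket
   address into buffer; returns buffer, or NULL on failure */
const char *
format_address (const struct sockaddr *address, char *buffer, socklen_t size)
{
  const void *src = NULL;
  if (address->sa_family == AF_INET)
    src = &((const struct sockaddr_in *) address)->sin_addr;
  else if (address->sa_family == AF_INET6)
    src = &((const struct sockaddr_in6 *) address)->sin6_addr;
  else
    return NULL; /* not an IP address */

  /* src points to the in_addr or in6_addr inside the socket address */
  return inet_ntop (address->sa_family, src, buffer, size);
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for either family, so a loop over the getaddrinfo() results could call format_address (server->ai_addr, text, sizeof (text)) for every node.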

Code Listing 4.3 showed how the port field could be explicitly set with a variable assignment. While it may not be apparent, Code Listing 4.4 illustrates a second, preferred way to handle this assignment. Specifically, getaddrinfo() puts the port number in the socket address based on standard, well-known ports. In this case, the parameter "http" maps to port number 80. Table 4.4 lists the port numbers for some common applications. Note that some services, such as FTP and Telnet, are discouraged because they offer no security guarantees; Telnet, for instance, allows anyone with a packet sniffer to discover a user’s password when logging in to a remote server. Instead, the secure approach is to run these protocols on top of SSH.

Port	Name	Service
21	FTP	Insecure file transfer
22	SSH	Secure shell
23	Telnet	Insecure remote access
25	SMTP	Email delivery
53	DNS	IP address lookup
67	DHCP	IP address assignment
68	DHCP	IP address assignment
80	HTTP	Web page
88	Kerberos	Authentication
110	POP3	POP email access
123	NTP	Time synchronization
143	IMAP	IMAP email access
194	IRC	Internet chat service
389	LDAP	Authentication
443	HTTPS	Secure web page
530	RPC	Remote procedure call
631	IPP	Internet printing
993	IMAPS	Secure IMAP access

Table 4.4: List of common well-known ports

Example 4.4.3

The relationship between the structs can get confusing in code, due to the casting and pointer dereferences involved. To illustrate their connections, assume that getaddrinfo() returns a list with two nodes containing the addresses in Example 4.4.1 and Example 4.4.2. This list could be visualized as follows:

Visualization of a linked list of two struct addrinfo nodes

The key observation is that the IP addresses (in the struct sockaddr_in and struct sockaddr_in6) are stored separately from the struct addrinfo nodes. Both nodes indicate the address is for a TCP server (indicated by ai_socktype = SOCK_STREAM and ai_protocol = IPPROTO_TCP). We can determine the type of address by checking ai_family, which is set to 2 (AF_INET) for IPv4 or 10 (AF_INET6) for IPv6. We could also determine this information indirectly by noting the ai_addrlen field, which indicates the length of the socket address struct (16 for a struct sockaddr_in, 28 for a struct sockaddr_in6).

In Code Listing 4.5, the server variable would point to the first struct addrinfo. Then the addr variable would point to the struct sockaddr_in shown just below it, allowing the call to inet_ntoa (addr->sin_addr) on the address. Code Listing 4.4 uses the next struct addrinfo, which contains the pointer to the IPv6 socket address. The sin6_addr field within that socket address can then be passed to inet_ntop() to print the address.

Example 4.4.1 and Example 4.4.2 detailed the byte contents of the struct sockaddr_in and struct sockaddr_in6, respectively. We can examine the first struct addrinfo as shown in the following table. In this case, we assume that the struct sockaddr_in instance is at memory location 0x5581f7459560 (as indicated by the ai_addr pointer), while the other struct addrinfo is stored at 0x5581f7453040 (as indicated by the ai_next pointer).

Struct	Field	Bytes
`struct addrinfo`	`ai_flags`	`00 00 00 00`
	`ai_family`	`02 00 00 00`
	`ai_socktype`	`01 00 00 00`
	`ai_protocol`	`06 00 00 00`
	`ai_addrlen`	`10 00 00 00`
	[PADDING]	`76 65 72 0a`
	`ai_addr`	`60 95 45 f7 81 55 00 00`
	`ai_canonname`	`00 00 00 00 00 00 00 00`
	`ai_next`	`40 30 45 f7 81 55 00 00`

Note that these struct addrinfo instances form a linked list of possible address responses (each of which points to an external struct sockaddr instance). In this example, the ai_next field of the first node holds the address of the second node. The list ends at the node whose ai_next is NULL, which here would be the second struct addrinfo.

Once the address information is established, it can be passed to the connect() function to establish the initial connection to the socket at a server address, so long as it is accepting requests. If TCP is the transport layer protocol used, connect() will send an initial message to the server host process to initiate the TCP 3-way handshake, making the server aware of the connection. If UDP or another connectionless protocol is used, connect() simply sets the IP address of the peer (i.e., the server) in the client host’s socket.

C library functions – <sys/socket.h>

int connect (int socket, const struct sockaddr *address, socklen_t address_len);: Connect to a server or set the peer address of a connectionless server socket.

Building on the previous examples, Code Listing 4.6 shows how the results from getaddrinfo() can be used to connect to the server. In this example, we are creating a TCP connection over IPv4 to the designated hostname. Based on the hints fields and the second parameter to getaddrinfo(), every address returned in the server_list will be configured to connect to a web server running HTTP. When the for-loop ends, the process is either connected to the server to start an HTTP session or there is no socket connection available. In the latter case, the socketfd would be -1, and the client should recognize the failed connection.

/* Code Listing 4.6:
   Client code that will connect to a web server for an HTTP session
 */

/* Declare an IPv4 socket for TCP */
int socketfd = -1;
struct addrinfo hints, *server_list = NULL, *server = NULL;
memset (&hints, 0, sizeof (hints));
hints.ai_family = AF_INET;       /* grab IPv4 only */
hints.ai_socktype = SOCK_STREAM; /* limit to byte streams */
hints.ai_protocol = IPPROTO_TCP; /* create as a TCP socket */

/* Get a list of addresses at hostname that serve HTTP */
if (getaddrinfo (hostname, "http", &hints, &server_list) != 0)
  exit (1); /* lookup failed, so there are no addresses to try */

for (server = server_list; server != NULL; server = server->ai_next)
  {
    /* Attempt to create a TCP IPv4 socket */
    if ((socketfd = socket (server->ai_family, server->ai_socktype, 0)) < 0)
      continue;
    if (connect (socketfd, server->ai_addr, server->ai_addrlen) == 0)
      break;
    close (socketfd);
    socketfd = -1;
  }
freeaddrinfo (server_list);
if (socketfd < 0)
  exit (1);

/* ... begin HTTP session here ... */
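For comparison, the connectionless case can be sketched with UDP, where connect() only records the peer address and no packet is sent. The helper names udp_set_peer() and peer_port() are hypothetical, and the address is passed as dotted decimal text purely to keep the illustration short; a real client would obtain it from getaddrinfo() as in Code Listing 4.6. The sketch also uses getpeername(), a standard function in sys/socket.h that this section does not otherwise cover, to read back the recorded peer.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket and record ip:port as its peer.
   Returns the socket, or -1 on failure. No packet is sent. */
int
udp_set_peer (const char *ip, uint16_t port)
{
  struct sockaddr_in peer;
  memset (&peer, 0, sizeof (peer));
  peer.sin_family = AF_INET;
  peer.sin_port = htons (port);
  if (inet_pton (AF_INET, ip, &peer.sin_addr) != 1)
    return -1; /* ip was not valid dotted decimal text */

  int socketfd = socket (AF_INET, SOCK_DGRAM, 0);
  if (socketfd < 0)
    return -1;
  if (connect (socketfd, (struct sockaddr *) &peer, sizeof (peer)) < 0)
    {
      close (socketfd);
      return -1;
    }
  return socketfd;
}

/* Hypothetical helper: ask the OS which peer port the socket recorded */
int
peer_port (int socketfd)
{
  struct sockaddr_in peer;
  socklen_t length = sizeof (peer);
  if (getpeername (socketfd, (struct sockaddr *) &peer, &length) < 0)
    return -1;
  return ntohs (peer.sin_port);
}
```

After int fd = udp_set_peer ("127.0.0.1", 8000); succeeds, peer_port (fd) returns 8000 even if no process is listening on that port, because no message has been exchanged yet.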

4.4.3. Server Socket Interface

Setting up the server socket involves a different sequence of steps from the client process. As before, getaddrinfo() provides an interface for configuring the socket address information, but the arguments will be structured differently. For starters, we are not using getaddrinfo() to perform a DNS query on a remote host; instead, we are setting up the socket based on the current host’s existing IP address. In addition, if the server is part of a custom application, rather than a standard utility like a web server, the port number will not be identified as one of the well-known ports.

Code Listing 4.7 shows the differences in the parameters passed to getaddrinfo() for a server. First, the hints.ai_flags field is set to AI_PASSIVE and the nodename argument is set to NULL. This combination specifies that the socket will use the local host’s IP address and will be listening for incoming requests. Next, the servname parameter has been changed to "8000" to demonstrate how a custom port number can be used. Systems have existing processes set up for well-known ports, and each port number can only be assigned to a single process. Consequently, if we want to build our own web server, we would need to pick a different port number that is not already in use, 8000 in this case. To connect to this server once it is running, the client from Code Listing 4.6 would also need to be modified to pass "8000" to getaddrinfo() instead of "http" as the servname argument.

/* Code Listing 4.7:
   Getting socket address information for a server
 */

struct addrinfo hints, *server_info = NULL;
memset (&hints, 0, sizeof (hints));
hints.ai_family = AF_INET;       /* grab IPv4 only */
hints.ai_socktype = SOCK_STREAM; /* specify byte-streaming */
hints.ai_flags = AI_PASSIVE;     /* use default IP address */
hints.ai_protocol = IPPROTO_TCP; /* create as a TCP socket */

/* Get the local address information for listening on port 8000 */
getaddrinfo (NULL, "8000", &hints, &server_info);
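
The return value of getaddrinfo() reports whether the lookup succeeded, with 0 meaning success. The following is a minimal sketch of checking it, using the same hints as Code Listing 4.7. The helper name resolve_passive() is hypothetical and is not part of the socket interface.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: look up a passive (server) IPv4 TCP address for
   the given port string. Returns 0 on success or the nonzero error
   code from getaddrinfo() on failure. The caller must eventually pass
   *result to freeaddrinfo(). */
int
resolve_passive (const char *port, struct addrinfo **result)
{
  assert (port != NULL && result != NULL);

  struct addrinfo hints;
  memset (&hints, 0, sizeof (hints));
  hints.ai_family = AF_INET;       /* IPv4 only */
  hints.ai_socktype = SOCK_STREAM; /* byte stream */
  hints.ai_flags = AI_PASSIVE;     /* address suitable for bind() */
  hints.ai_protocol = IPPROTO_TCP; /* TCP */

  int rc = getaddrinfo (NULL, port, &hints, result);
  if (rc != 0)
    fprintf (stderr, "getaddrinfo: %s\n", gai_strerror (rc));
  return rc;
}
```

With AI_PASSIVE and a NULL node name, the returned address is the wildcard address (INADDR_ANY for IPv4), so a socket bound to it can receive requests on any of the host’s interfaces.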

Once the socket address information has been configured, the process can then make a sequence of function calls to become a server. Typically (though not required), the first call is to setsockopt() to configure the socket with the SO_REUSEADDR option. This option avoids a common error during the next step, bind(). The bind() call links the local address and port number with the socket. When a server is restarted, connections that belonged to the previous process (which is no longer running) can linger for a short time in the TCP TIME_WAIT state, causing bind() to fail with an “address already in use” error. Setting the SO_REUSEADDR option tells bind() to allow the port to be reused in this situation.

C library functions – <sys/socket.h>

int setsockopt (int socket, int level, int option_name, const void *option_value, socklen_t option_len);: Configure internal settings for a socket.
int bind (int socket, const struct sockaddr *address, socklen_t address_len);: Assign a local sockaddr to a socket identifier; returns negative values if the bind cannot be done.

Code Listing 4.8 demonstrates using setsockopt() and bind() to extend Code Listing 4.7 to set up a TCP server. From Code Listing 4.7, the port number requested for the server is 8000, which is the port number that would be stored in the struct sockaddr passed to bind(). The call to setsockopt() sets the SO_REUSEADDR option to true (1, stored in socket_option) at the socket level (SOL_SOCKET), meaning the option applies to the socket itself rather than to a specific protocol such as TCP or IP. If the bind() is successful, the server is established. Another common option is SO_RCVTIMEO, which sets a time limit for blocking calls that read from the socket, allowing the process to close lost connections.

/* Code Listing 4.8:
   Setting up a connection-oriented server and receiving connections
 */

/* Extending Code Listing 4.7 */
struct addrinfo *server = NULL;
int socketfd = -1;
int socket_option = 1;
for (server = server_info; server != NULL; server = server->ai_next)
  {
    /* Attempt to create a TCP socket */
    if ((socketfd = socket (server->ai_family, server->ai_socktype, 0)) < 0)
      continue;

    /* Configure the socket to ignore bind reuse error */
    setsockopt (socketfd, SOL_SOCKET, SO_REUSEADDR, (const void *) &socket_option,
                sizeof (int));
    /* Set a 5-second timeout when waiting to receive */
    struct timeval timeout = { 5, 0 };
    setsockopt (socketfd, SOL_SOCKET, SO_RCVTIMEO, (const void *) &timeout,
                sizeof (timeout));

    /* Bind the TCP socket to the port number */
    if (bind (socketfd, server->ai_addr, server->ai_addrlen) == 0)
      break;
    close (socketfd);
    socketfd = -1;
  }

freeaddrinfo (server_info);
if (socketfd < 0)
  {
    perror ("ERROR: Failed to bind socket");
    exit (1);
  }
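
One way to confirm what bind() did is to ask the OS with getsockname() and getsockopt(). The sketch below is illustrative only. The helper bind_loopback() is hypothetical, it fills in the struct sockaddr_in by hand instead of calling getaddrinfo(), and it binds to the loopback address rather than the host’s default address. Passing 0 as the port asks the OS to choose an unused port.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1 on the
   requested port (0 lets the OS pick one). On success, returns the
   socket and stores the port actually bound in *bound_port; returns -1
   on any error. */
int
bind_loopback (uint16_t port, uint16_t *bound_port)
{
  assert (bound_port != NULL);

  int fd = socket (AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;

  /* Allow the port to be reused while old connections linger */
  int reuse = 1;
  if (setsockopt (fd, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse)) < 0)
    {
      close (fd);
      return -1;
    }

  struct sockaddr_in addr;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);

  /* Bind, then read back the address the OS actually assigned */
  socklen_t len = sizeof (addr);
  if (bind (fd, (struct sockaddr *) &addr, len) < 0
      || getsockname (fd, (struct sockaddr *) &addr, &len) < 0)
    {
      close (fd);
      return -1;
    }

  *bound_port = ntohs (addr.sin_port);
  return fd;
}
```

A caller that needs a fixed port, such as 8000, would pass that value instead of 0 and treat a return value of -1 as a sign that the port is unavailable.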

Figure 4.4.12: Timing for a connection-less server that uses UDP

For connection-less protocols like UDP, no further action is needed to set up the server. The server can immediately begin waiting on incoming messages from the socket. Figure 4.4.12 illustrates the flow of the client and server function calls for this scenario. Observe that connect() does not transmit any message across the network and only updates local settings on the client. To exchange data, the two processes would call sendto() and recvfrom(), which are explained in the next section.

Connection-oriented TCP sockets require two additional function calls. The first, listen(), converts the socket to a connection-oriented server socket with a designated request queue. The second parameter, backlog, is a hint for the maximum number of enqueued connection requests. The effect of setting this value to 0 is implementation-defined; some systems fall back to a minimum or default queue length. There is also a maximum allowable listen queue size, defined by the constant SOMAXCONN. Once the process has converted its socket to a server socket, repeated calls to accept() establish connections with incoming requests. The accept() function is blocking, so the process will wait at that point until a new request comes in. When a new request arrives, the server side of the 3-way handshake is performed to establish the connection, and accept() returns a new file descriptor for that connection, storing information about the client in the address and address_len fields. The original socket remains open to receive further requests.

C library functions – <sys/socket.h>

int listen (int socket, int backlog);: Convert the socket to a server socket that can accept incoming requests.
int accept (int socket, struct sockaddr *address, socklen_t *address_len);: Retrieve the first incoming connection, putting client’s information in the sockaddr.

Figure 4.4.14: Timing for a connection-oriented server that uses TCP

Figure 4.4.14 illustrates how the timing of the client and server functions relate. Both processes independently set up their sockets. The server then executes the sequence of calling bind(), listen(), and accept(). The call to accept() is blocking, so the server would then wait until a connection request arrives. When the request arrives, accept() would collect information about the client host and return. Contrast the functionality of connect() shown here with that shown in Figure 4.4.12. With TCP sockets, the client call to connect() sends an initial message to the server to establish the connection.
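
The sequence described above can be sketched within a single process by using the loopback address. This is only an illustration under stated assumptions: the helper loopback_handshake() is hypothetical, and a real client and server would run as separate processes. The sketch relies on the OS completing the handshake for requests waiting in the listen queue, so connect() can return before accept() is called.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: run bind(), listen(), connect(), and accept()
   over loopback. Stores the client port reported by accept() in
   *seen_port and the port the client socket really uses in
   *actual_port. Returns 0 on success, -1 on error. */
int
loopback_handshake (uint16_t *seen_port, uint16_t *actual_port)
{
  assert (seen_port != NULL && actual_port != NULL);

  int result = -1, client = -1, connection = -1;
  struct sockaddr_in addr, peer, local;
  socklen_t len = sizeof (addr);
  socklen_t peer_len = sizeof (peer), local_len = sizeof (local);

  /* Server side: socket(), bind() to an OS-chosen port, listen() */
  int listener = socket (AF_INET, SOCK_STREAM, 0);
  if (listener < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (0);
  if (bind (listener, (struct sockaddr *) &addr, len) < 0
      || getsockname (listener, (struct sockaddr *) &addr, &len) < 0
      || listen (listener, 1) < 0)
    goto done;

  /* Client side: connect() sends the request to the listening port */
  client = socket (AF_INET, SOCK_STREAM, 0);
  if (client < 0
      || connect (client, (struct sockaddr *) &addr, sizeof (addr)) < 0)
    goto done;

  /* Server side: accept() returns a new descriptor for the connection
     and fills in the client's address */
  connection = accept (listener, (struct sockaddr *) &peer, &peer_len);
  if (connection < 0)
    goto done;

  /* Ask the client socket which local port it was assigned */
  if (getsockname (client, (struct sockaddr *) &local, &local_len) < 0)
    goto done;

  *seen_port = ntohs (peer.sin_port);
  *actual_port = ntohs (local.sin_port);
  result = 0;

done:
  if (connection >= 0)
    close (connection);
  if (client >= 0)
    close (client);
  close (listener);
  return result;
}
```

On success, the two ports match: the address that accept() reports is the one the OS assigned to the client socket, not the port on which the server is listening.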

Code Listing 4.9 shows the final steps of setting up a connection-oriented TCP server and receiving requests, beginning with the call to listen(). After converting the socket to a server socket, the process enters a loop waiting on incoming connection requests. When accept() returns with a connection, the client’s IP address and port number are copied into the struct sockaddr. Note that the client’s port number will not be 8000, which is the port number chosen for this server. Instead, client sockets are assigned an ephemeral port when they are being set up; this is a pseudo-randomly selected integer from a system-dependent range within 1024 - 65535. In this example, the server uses the inet_ntoa() and ntohs() utilities to print the client’s IP address and port number, then immediately closes the connection and frees the resources. If the client tries to read from the socket at that point, the read would typically return 0 to indicate that the connection has been closed, and a write would eventually fail with an error.

/* Code Listing 4.9:
   Receiving TCP connections
 */

/* Extending Code Listing 4.7 and 4.8 */

/* Convert to server socket */
listen (socketfd, 10);

while (1)
  {
    /* Allocate space for the incoming address info. The socket was
       created with AF_INET, so the client address is a sockaddr_in.
       accept() overwrites addrlen, so it is reset on every iteration. */
    socklen_t addrlen = sizeof (struct sockaddr_in);
    struct sockaddr *address = calloc (1, (size_t) addrlen);
    assert (address != NULL);

    int connection;
    if ((connection = accept (socketfd, address, &addrlen)) < 0)
      {
        free (address);
        break;
      }

    /* Print the client's address (filled in by accept) and close */
    struct sockaddr_in *addr = (struct sockaddr_in *) address;
    printf ("Incoming request from %s:%" PRIu16 "\n", inet_ntoa (addr->sin_addr),
            ntohs (addr->sin_port));

    close (connection);
    free (address);
  }
close (socketfd);
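
The inet_ntoa() function returns a pointer to a statically allocated string and handles only IPv4 addresses. As a possible alternative, the sketch below formats a client address with inet_ntop(), which writes into a buffer supplied by the caller. The helper name format_endpoint() is hypothetical.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <inttypes.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: write "address:port" for an IPv4 socket address
   into out. Returns 0 on success, or -1 if the address cannot be
   converted or out is too small to hold the full string. */
int
format_endpoint (const struct sockaddr_in *addr, char *out, size_t out_len)
{
  assert (addr != NULL && out != NULL);

  /* Convert the binary address to dotted-decimal text */
  char ip[INET_ADDRSTRLEN];
  if (inet_ntop (AF_INET, &addr->sin_addr, ip, sizeof (ip)) == NULL)
    return -1;

  /* The port is stored in network byte order, so convert it first */
  int written = snprintf (out, out_len, "%s:%" PRIu16, ip,
                          ntohs (addr->sin_port));
  if (written < 0 || (size_t) written >= out_len)
    return -1;
  return 0;
}
```

For a struct sockaddr_in holding the address 192.0.2.7 and port 8000, this produces the string 192.0.2.7:8000.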

4.4.4. Socket Communication¶

For TCP sockets, exchanging messages between the client and server can be done using the standard read() and write() operations, as with other forms of IPC. This works because sockets are treated like files, and the value returned from socket() behaves the same as any other file descriptor. UDP sockets generally use recvfrom() and sendto() for data exchange. These functions use struct sockaddr parameters to determine the sender’s IP address when receiving and to specify the destination when sending. The read() and write() functions cannot serve this purpose on an unconnected UDP socket, as the socket identified by the file descriptor does not store this information. The recvfrom() and sendto() functions can also be used by TCP for consistency.

C library functions – <sys/socket.h>

ssize_t recvfrom (int socket, void *buffer, size_t length, int flags, struct sockaddr *address, socklen_t *address_len);: Receive up to the given length in bytes from a socket.
ssize_t sendto (int socket, const void *message, size_t length, int flags, const struct sockaddr *dest_addr, socklen_t dest_len);: Send a message to another host through a socket.

As with other socket-related functions, recvfrom() and sendto() use the generic struct sockaddr type, relying on the socklen_t parameter to determine its length and, indirectly, its specific type. Both functions take a void* parameter and a size that identify the buffer holding the bytes to send or the buffer into which received bytes are written. The flags parameter for each can specify advanced usage options. Code Listing 4.10 illustrates the use of these functions, sending a simple HTTP request and reading the first part of the response.
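
For comparison with the TCP example, the following sketch shows how sendto() and recvfrom() might be used with UDP. It is illustrative only: the helper udp_loopback() is hypothetical, and both sockets live in one process and communicate over the loopback address. The 2-second SO_RCVTIMEO value is an arbitrary choice that keeps recvfrom() from blocking indefinitely if the message is lost.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send message from one UDP socket to another
   over loopback and copy what arrives into reply as a string.
   Returns the number of bytes received, or -1 on error or timeout. */
ssize_t
udp_loopback (const char *message, char *reply, size_t reply_len)
{
  assert (message != NULL && reply != NULL && reply_len > 0);

  ssize_t received = -1;
  int sender = -1;
  struct sockaddr_in addr, from;
  socklen_t len = sizeof (addr), from_len = sizeof (from);
  struct timeval timeout = { 2, 0 };

  /* Receiver: bind a UDP socket to an OS-chosen loopback port */
  int receiver = socket (AF_INET, SOCK_DGRAM, 0);
  if (receiver < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (0);
  if (setsockopt (receiver, SOL_SOCKET, SO_RCVTIMEO, &timeout,
                  sizeof (timeout)) < 0
      || bind (receiver, (struct sockaddr *) &addr, len) < 0
      || getsockname (receiver, (struct sockaddr *) &addr, &len) < 0)
    goto done;

  /* Sender: no bind() or connect(); the destination address is
     passed with each message */
  sender = socket (AF_INET, SOCK_DGRAM, 0);
  if (sender < 0
      || sendto (sender, message, strlen (message), 0,
                 (struct sockaddr *) &addr, sizeof (addr)) < 0)
    goto done;

  /* recvfrom() stores the sender's address in from along with the
     data; one byte of reply is reserved for the terminating \0 */
  received = recvfrom (receiver, reply, reply_len - 1, 0,
                       (struct sockaddr *) &from, &from_len);
  if (received >= 0)
    reply[received] = '\0';

done:
  if (sender >= 0)
    close (sender);
  close (receiver);
  return received;
}
```

Neither listen() nor accept() appears in this sketch, which matches the connection-less setup described earlier: the receiver only needs a bound socket before it can wait for messages.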

/* Code Listing 4.10:
   Sending to and receiving from an IPv6 socket
 */

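/* Extending Code Listing 4.6; server must still point to valid data
   here, so the call to freeaddrinfo() has to be deferred until after
   this code runs */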

/* Create a message for a simple HTTP/1.0 request */
size_t buffer_len = 100;
char buffer[buffer_len];
memset (buffer, 0, buffer_len);
strncpy (buffer, "GET / HTTP/1.0\r\n\r\n", buffer_len);

/* Send only the request text, not the unused remainder of the buffer;
   the return value is the number of bytes sent */
ssize_t bytes = sendto (socketfd, buffer, strlen (buffer), 0, server->ai_addr,
                        server->ai_addrlen);

/* Copy the server IP address into a buffer */
char addr_buffer[INET6_ADDRSTRLEN];
inet_ntop (AF_INET6, &((struct sockaddr_in6 *)server->ai_addr)->sin6_addr,
           addr_buffer, sizeof (addr_buffer));
printf ("Sent %zd bytes to %s\n", bytes, addr_buffer);

/* Read all data into the buffer, keeping space for \0 at end */
while ((bytes = recvfrom (socketfd, buffer, buffer_len - 1, 0, server->ai_addr,
                          &server->ai_addrlen)) > 0)
  {
    printf ("%s", buffer);
    memset (buffer, 0, buffer_len);
  }
close (socketfd);

Code Listing 4.10 assumes that a client socket has already been created and has connected to a web server, as shown in Code Listing 4.6. For simplicity regarding the call to inet_ntop(), we are assuming this socket uses an IPv6 address; this assumption is only needed to print the IP address out. When receiving, since the client does not know the total number of bytes that will be sent, the client enters a loop that repeatedly requests enough data to fill up the buffer. Since recvfrom() is a blocking call, the process will wait if there are delays in the network. When the server is finished sending, recvfrom() will return zero and the client will exit the loop.
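
The receive loop can also be written to accumulate the response in a single buffer rather than printing each piece. The sketch below is one way to do this under stated assumptions: the helper names receive_all() and receive_all_demo() are hypothetical, recv() is used in place of recvfrom() because the sender’s address is not needed, and the demonstration uses a local socketpair() in place of a network connection.

```c
/* Request POSIX declarations when compiling in strict ISO C mode */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until the peer finishes sending
   or the buffer is full, keeping space for \0 at the end. Returns
   the number of bytes stored, or -1 on error. */
ssize_t
receive_all (int fd, char *out, size_t capacity)
{
  assert (out != NULL && capacity > 0);

  size_t total = 0;
  while (total < capacity - 1)
    {
      ssize_t n = recv (fd, out + total, capacity - 1 - total, 0);
      if (n < 0)
        return -1; /* error; errno describes the cause */
      if (n == 0)
        break; /* the peer has finished sending */
      total += (size_t) n;
    }
  out[total] = '\0';
  return (ssize_t) total;
}

/* Hypothetical demonstration: send two pieces through a local
   socket pair, close the sending direction, and collect them. */
ssize_t
receive_all_demo (char *out, size_t capacity)
{
  int fds[2];
  if (socketpair (AF_UNIX, SOCK_STREAM, 0, fds) < 0)
    return -1;

  const char *parts[] = { "hello ", "world" };
  ssize_t total = -1;
  if (send (fds[0], parts[0], strlen (parts[0]), 0) >= 0
      && send (fds[0], parts[1], strlen (parts[1]), 0) >= 0)
    {
      /* Signal end of data so the receiver's recv() returns 0 */
      shutdown (fds[0], SHUT_WR);
      total = receive_all (fds[1], out, capacity);
    }

  close (fds[0]);
  close (fds[1]);
  return total;
}
```

Because a byte stream does not preserve message boundaries, the two pieces may arrive in one recv() call or in several; the loop produces the same combined result either way.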

[1]	The `"_in"` part of `sockaddr_in` means “Internet,” not “input.” Relatedly, the field names have been changed from `sa_` (socket address) to `sin_` (Internet socket address). For IPv6, the name of the `struct` becomes `sockaddr_in6` and the field names begin with `sin6_`.

[2]	Operating systems derived from BSD UNIX, including macOS, have an additional field in both `sockaddr_in` and `sockaddr_in6` to specify the length of the `struct`. This field is omitted from our discussion, as it is not universal.