5.6. Link Layer
The Internet layer determines the logical path that packets will traverse through a local network
and the Internet as a whole. The link layer defines the protocols that control how the bits are
transmitted across an underlying physical technology. For example, the network shown previously in
Figure 5.5.3 places an emphasis on routing a packet through a series
of routers in an AS. These routers may be part of distinct networks that use different underlying
technologies. For instance, each router may be operated and administered by separate departments
that are part of the same organization; some links may consist of wireless connections, while other
logical links are created by a chain of switches—router-like devices that are connected by
cables. The link layer, then, focuses on the task of forwarding packets across
point-to-point connections between routers and end-point devices.
The distinction between routing and forwarding, like the distinction between routers and switches, is
subtle and can be misunderstood. In essence, the classification of routing/routers is used to
describe the communication between heterogeneous networks that may rely on different types of
communication technologies. One network might employ a packet switching technology (e.g.,
Ethernet) that uses a structured message format that allows any device to send and receive data at
any time. A router might connect that network to one that uses token passing (e.g., FDDI
or token ring), in which only the device that currently holds the token may transmit; other devices
may be connected to the network, but they have to wait until it is their turn to control the
transmission channel. In contrast, the classification of forwarding/switches refers to communication
within a homogeneous network with a single underlying technology. A switch does not receive a
message from one technology (Ethernet) and forward it using another (FDDI). Switches only serve as
the links between hosts in a single network.
The switching that occurs at the link layer creates another possible source of packet loss that
TCP’s reliability is intended to address. When a packet arrives, there is a processing delay
associated with the work to compute checksums, determine the higher-level protocol, and so on.
Queueing delays occur while the packet is waiting to be processed or
transmitted. Transmission delays are imposed by the work to encode the
data into light signals or radio waves. Since the light signals and radio waves must travel across
physical space, the packet also experiences propagation delays. These
delays occur at every switching link in the network, accumulating to create increasingly greater
round-trip times. Consequently, link-layer protocols strive to balance correctness (ensuring limited
re-transmissions) with efficient processing in order to avoid causing packet losses.
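The way these four delays accumulate can be sketched with a short calculation. This is a minimal sketch, not a measurement: it assumes the common model in which transmission delay is the frame size divided by the link rate and propagation delay is the link length divided by the signal speed, and every number (link rate, distance, processing and queueing times) is a hypothetical value chosen for illustration.

```python
def link_delay(frame_bits, rate_bps, distance_m, speed_mps, processing_s, queueing_s):
    """Delay added by one link: processing + queueing + transmission + propagation."""
    transmission_s = frame_bits / rate_bps   # time to place every bit on the medium
    propagation_s = distance_m / speed_mps   # time for the signal to cross the link
    return processing_s + queueing_s + transmission_s + propagation_s

# Hypothetical path of three identical links, each carrying one 1500-octet frame.
links = [dict(frame_bits=1500 * 8, rate_bps=100e6, distance_m=200.0,
              speed_mps=2e8, processing_s=10e-6, queueing_s=50e-6)] * 3

one_way = sum(link_delay(**link) for link in links)
print(f"one-way delay: {one_way * 1e6:.1f} microseconds")
print(f"round trip:    {2 * one_way * 1e6:.1f} microseconds")
```

Adding more links to the list lengthens the total in direct proportion, which is the accumulation effect described above.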
5.6.1. LAN Packet Transmission: Ethernet
Given the ubiquity of the technology, many readers probably associate the term Ethernet with the
cable. In actuality, Ethernet is a collection of standards defined and maintained by the
IEEE 802.3 working group. That is, Ethernet is not a stand-alone protocol defined in an RFC;
Ethernet involves several protocols that are co-designed with the physical cable technology that
they use. These physical technologies can range from twisted-pair copper wires to
fiber-optic cables made of glass or plastic.
Regardless of the type of the physical medium used, all Ethernet frames maintain the same basic
structure, shown in Table 5.12. Note the use of the term octet rather than byte.
Although modern systems have generally settled on the use of the term byte to denote eight bits,
this connotation was not always true; some technologies used byte to refer to a basic addressable
unit of memory, which was not necessarily eight bits in size. An octet, however, must be exactly
eight bits.
Table 5.12: Structure of an Ethernet frame
The preamble of an Ethernet frame consists of seven octets of
10101010 followed by a single
10101011. The purpose of the
preamble is to declare that a device intends to send a
frame and to synchronize the other devices to listen as receivers. The destination and source
addresses are 48-bit (6-octet) media access control (MAC) addresses. Unlike IP
addresses, MAC addresses are persistently associated with a hardware device and do not provide any
implication of the device’s logical location in the network. MAC addresses are determined by the
device manufacturer and are stored in either firmware or hard-wired storage. The type field of the
Ethernet frame identifies the protocol carried in the payload; a value of 1536 (hexadecimal 0600)
or greater also marks the frame as using the Ethernet II format. The payload contains the
Internet-layer data (e.g., an IP packet); the maximum size varies based on the version of Ethernet,
but most have a maximum transmission unit (MTU) size of approximately 1500 octets. Finally, the
frame ends with the frame check sequence (FCS), which is a 32-bit cyclic redundancy check
(CRC) calculation that provides a more robust error detection mechanism than simple checksums. As one
example of the difference, CRC values can detect when the order of the octets has been changed,
while a checksum that only adds the octet values together cannot.
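The difference in sensitivity to reordering can be demonstrated in a few lines. In this sketch, `additive_checksum` is a hypothetical toy function that simply sums the octets, and Python's `zlib.crc32` stands in for the FCS calculation; it is based on the same CRC-32 polynomial, but the bit-ordering and framing details of real Ethernet hardware are ignored here.

```python
import zlib

def additive_checksum(data: bytes) -> int:
    """Toy checksum: the sum of all octets, truncated to 16 bits."""
    return sum(data) & 0xFFFF

original = b"GET /cat.gif"
reordered = b"GET /act.gif"   # the same octets with two neighbors swapped

# Addition is commutative, so the sum cannot see the swap.
print(additive_checksum(original) == additive_checksum(reordered))   # True

# The CRC depends on the position of every bit, so the swap changes it.
print(zlib.crc32(original) == zlib.crc32(reordered))                 # False
```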
The MTU size implies that a lot of network traffic requires multiple frames. Consider an HTTP
request to load a GIF containing an Internet meme showing a short video of cats (people on the
Internet love cat videos!). Such files tend to be multiple MB in size. If a single GIF is 3 MB in
size, that file alone would require 2098 blocks of 1500 bytes. However, each frame must also have
the TCP/UDP and IP headers attached, so some of the 1500 bytes is already accounted for. Using the
bare minimum of 20 bytes for TCP and 40 for IP (the size of the fixed IPv6 header), the file would
now require 2185 frames. This
fragmentation strains the reliability service of TCP, as all of these frames must be
successfully transmitted (and retransmitted when necessary) before the relevant timeouts expire. If
frames repeatedly fail to arrive on
time, the TCP client (i.e., the web browser) gives up on the transfer and provides the user
with a (generally unhelpful) error message that the connection timed out. This is one reason that
OSPF, RIP, and BGP prioritize finding the shortest, most efficient path possible.
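The frame counts in this example can be reproduced with integer ceiling division. The sketch below assumes that 1 MB means 2^20 bytes and that every frame carries a full payload; `frames_needed` is a hypothetical helper name.

```python
def frames_needed(total_bytes: int, mtu: int = 1500, header_bytes: int = 0) -> int:
    """Frames required when each frame carries (mtu - header_bytes) bytes of data."""
    per_frame = mtu - header_bytes
    return -(-total_bytes // per_frame)   # ceiling division using integers only

file_size = 3 * 1024 * 1024               # 3 MB, taking 1 MB = 2**20 bytes

print(frames_needed(file_size))                          # 2098
print(frames_needed(file_size, header_bytes=20 + 40))    # 2185
```

The 60 bytes of headers reduce the usable space per frame to 1440 bytes, which accounts for the 87 additional frames.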
To illustrate the structure of an Ethernet frame, the following header extends the IPv4 datagram
from Example 5.5.1 (which extends the TCP segment from Example 5.3.2).
The destination field is the MAC address
f0-de-f1-2c-c2-2b, and the source field is the address
4f-5c-89-bd-33-2d. These identifiers are persistently associated with the networking hardware
components. The type 0800 identifies the payload as an IPv4 packet carried in an Ethernet II frame,
the most common style of
Ethernet framing. Finally, the FCS is the 32-bit CRC calculated over the rest of the frame (the
addresses, type, and payload).
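The field layout described here can be sketched by packing the same values into bytes. This is an illustration under stated assumptions rather than a faithful encoder: the preamble is omitted (the hardware adds it), the 46 zero octets are a placeholder for the real IPv4 datagram, `zlib.crc32` stands in for the FCS calculation, and storing the FCS least-significant byte first is an assumption of this sketch. The helper names are hypothetical.

```python
import struct
import zlib

def mac_to_bytes(mac: str) -> bytes:
    """Convert a string such as 'f0-de-f1-2c-c2-2b' into six raw octets."""
    return bytes(int(part, 16) for part in mac.split("-"))

def build_frame(dst: str, src: str, frame_type: int, payload: bytes) -> bytes:
    """Lay out destination, source, type, and payload, then append a CRC-32."""
    header = struct.pack("!6s6sH", mac_to_bytes(dst), mac_to_bytes(src), frame_type)
    body = header + payload
    fcs = struct.pack("<I", zlib.crc32(body))   # byte order is an assumption here
    return body + fcs

placeholder_payload = bytes(46)                 # stands in for the IPv4 datagram
frame = build_frame("f0-de-f1-2c-c2-2b", "4f-5c-89-bd-33-2d", 0x0800, placeholder_payload)

print(frame[:14].hex())   # f0def12cc22b4f5c89bd332d0800
print(len(frame))         # 64
```

The first 14 octets are the two addresses followed by the type, exactly in the order the text lists them.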
The figure below illustrates the complete structure of the Ethernet frame by combining this example
with Example 5.3.2 and Example 5.5.1. The frame begins with the Ethernet
header. The Ethernet payload combines the IPv4 header, TCP header, and HTTP header. (As a
request, the HTTP message body is empty and only the header is sent.) At the same time, the IPv4
payload consists of the TCP and HTTP headers, whereas the HTTP header is the payload of the TCP
segment.
Anatomy of a complete Ethernet frame with IPv4, TCP, and HTTP data
5.6.2. LAN Packet Transmission: ARP
Figure 5.6.3: Devices connected to the same Ethernet segment
The previous discussion of Ethernet introduced a new form of addressing to locate hosts within a
network. Figure 5.6.3 shows a simple Ethernet segment with two end devices and a
router; each of these three hosts has both a MAC address and an IP address. MAC addresses do not
have any logical relationship to the network topology itself, while IP addresses are logical
identifiers that are not tied to the hardware. As such, hosts and routers need some way to translate an IP
address into a MAC address. Without such a mapping, a sender would not be able to encapsulate the IP
packet in an Ethernet frame for the intended host device.
The Address Resolution Protocol (ARP) is a simple protocol for establishing this mapping, as
defined in RFC 826. Assume that the two end host devices in Figure 5.6.3 need to communicate, with
the 192.168.1.2 host sending data to 192.168.1.3. The sender broadcasts an Ethernet frame containing
an ARP query to the reserved MAC address
ff-ff-ff-ff-ff-ff. All nodes receive the query, but
only the intended recipient, 192.168.1.3, replies. At that point, the 192.168.1.2 host stores this
mapping in a local cache for a period of time. After doing this, 192.168.1.2 can use the appropriate
destination MAC address to transmit the IP packets as needed.
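The query, reply, and caching steps can be modeled with a small simulation. This is a toy model, not the RFC 826 packet format: the `Host` class, the dictionary used as a query, the 60-second cache lifetime, the MAC addresses, and the router's IP address are all assumptions made for illustration. Only the two host IP addresses and the broadcast address come from the text.

```python
import time

BROADCAST = "ff-ff-ff-ff-ff-ff"

class Host:
    def __init__(self, ip, mac, cache_ttl=60.0):
        self.ip = ip
        self.mac = mac
        self.cache_ttl = cache_ttl   # seconds an entry stays valid (hypothetical value)
        self.arp_cache = {}          # ip -> (mac, expiry timestamp)

    def handle_query(self, query):
        """Every host sees a broadcast query; only the owner of the target IP replies."""
        if query["dst_mac"] == BROADCAST and query["target_ip"] == self.ip:
            return self.mac
        return None

    def resolve(self, target_ip, segment):
        """Return the MAC address for target_ip, broadcasting a query on a cache miss."""
        cached = self.arp_cache.get(target_ip)
        if cached and cached[1] > time.monotonic():
            return cached[0]
        query = {"dst_mac": BROADCAST, "src_mac": self.mac, "target_ip": target_ip}
        for host in segment:
            if host is self:
                continue
            reply = host.handle_query(query)
            if reply is not None:
                self.arp_cache[target_ip] = (reply, time.monotonic() + self.cache_ttl)
                return reply
        return None

# MAC addresses and the router's IP address are made up for this sketch.
router = Host("192.168.1.1", "02-00-00-00-00-01")
sender = Host("192.168.1.2", "02-00-00-00-00-02")
receiver = Host("192.168.1.3", "02-00-00-00-00-03")
segment = [router, sender, receiver]

print(sender.resolve("192.168.1.3", segment))   # 02-00-00-00-00-03
print("192.168.1.3" in sender.arp_cache)        # True
```

A second call to `resolve` for the same address within the cache lifetime returns the stored entry without querying the segment again.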
ARP is an insecure protocol that assumes all connected devices behave correctly. In an ARP
cache poisoning attack, an adversary that has access to a network can respond to ARP queries with
its own MAC address. The protocol defines no authentication mechanism to confirm that the response
is correct. This weakness is often acceptable if networks are secured so that only authorized
devices can be used. However, in public settings, such as a free café Wi-Fi network, the assumption
of trust can break down, allowing devices to intercept messages intended for others.
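The root of the weakness is easy to see in a toy cache that, like the protocol itself, performs no check on who sent a reply. The function name and both MAC addresses below are hypothetical.

```python
arp_cache = {}

def accept_reply(cache, ip, mac):
    """With no authentication step, this toy cache trusts every reply it receives."""
    cache[ip] = mac

accept_reply(arp_cache, "192.168.1.3", "02-00-00-00-00-03")   # genuine reply
accept_reply(arp_cache, "192.168.1.3", "02-00-00-00-00-99")   # forged reply from an adversary

# The forged entry silently replaced the genuine one.
print(arp_cache["192.168.1.3"])   # 02-00-00-00-00-99
```

Frames addressed using the cached entry would now be delivered to the adversary's hardware rather than to the intended host.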
5.6.3. What Lies Beneath: Carrier Signals
To summarize the Internet model up to this point, the application layer uses transport-layer
protocols to create a process-to-process logical communication channel. The transport layer
encapsulates this information in a host-to-host link using Internet-layer routing between
potentially heterogeneous networks. The link layer then provides the mechanism for point-to-point
data transmission in a homogeneous network using the same underlying physical technology. These
layers of abstraction leave one question remaining: How do the bits actually get transmitted from
one device to another?
Figure 5.6.5 illustrates the basic principles involved in the physical data
transmission. Fundamentally, all of the physical networking technologies are transmitting
electrical, light, or radio signals, all of which can be modeled as an oscillating waveform. The default signal
with no encoded information is called a carrier signal. This signal can be modulated
to encode information by manipulating one of three characteristics of waves:
frequency, amplitude, or phase. The frequency refers to the number of
oscillations in a given time, as illustrated by how many times the wave oscillates between a maximum
and minimum value. The amplitude denotes the height of the wave. The phase refers to the timing of
when the wave begins and ends, illustrated by the alignment of the maximum and minimum values.
Figure 5.6.5: Three techniques for modulating a carrier signal to encode bits
In phase shift keying (PSK), the carrier wave operates at a fixed frequency, but its phase
is shifted by adjusting the sine and cosine components of the signal. The precise calculations depend on the
particular scheme being used, but these techniques generally all map the measurement values to
points on the complex number plane. In binary PSK (BPSK), there are two possible points to indicate
the values 0 or 1. Other schemes use more points to map multiple bits. For instance, each
measurement in quadrature PSK (QPSK) maps to one of four points to encode two bits; 8-PSK uses eight
points to encode three bits.
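The mapping from bits to points on the complex plane can be sketched as follows. This is a simplified illustration: the points are spaced evenly around the unit circle and are indexed in plain binary order, which is an assumption of this sketch (deployed schemes typically choose the phase offsets and the bit-to-point assignment more carefully). The function names are hypothetical.

```python
import cmath
import math

def psk_constellation(bits_per_symbol: int):
    """Evenly spaced points on the unit circle: 2 for BPSK, 4 for QPSK, 8 for 8-PSK."""
    count = 2 ** bits_per_symbol
    return [cmath.exp(2j * math.pi * k / count) for k in range(count)]

def modulate(bits: str, bits_per_symbol: int):
    """Map each group of bits to one constellation point."""
    points = psk_constellation(bits_per_symbol)
    return [points[int(bits[i:i + bits_per_symbol], 2)]
            for i in range(0, len(bits), bits_per_symbol)]

def demodulate(symbols, bits_per_symbol: int) -> str:
    """Pick the nearest constellation point for each measurement and emit its bits."""
    points = psk_constellation(bits_per_symbol)
    out = []
    for symbol in symbols:
        index = min(range(len(points)), key=lambda k: abs(symbol - points[k]))
        out.append(format(index, f"0{bits_per_symbol}b"))
    return "".join(out)

bits = "10101011"
print(len(modulate(bits, 1)))              # 8 symbols with BPSK
print(len(modulate(bits, 2)))              # 4 symbols with QPSK
print(demodulate(modulate(bits, 2), 2))    # 10101011
```

Choosing the nearest point is what lets the receiver tolerate a small amount of noise in each measurement.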
In frequency modulation (FM), the frequency is changed to be either faster or slower than
the carrier wave. If this technique were used on sound waves that were in a range audible to humans,
FM would correspond to making the pitch higher or lower than the default range. Amplitude
modulation (AM) keeps the frequency the same as the carrier wave but increases or decreases the
magnitude of the difference between the maximum and minimum values. In the audible range, this would
correspond to making the sound louder or softer.
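Both techniques can be sketched by generating samples of a sine wave. All parameters here are assumptions chosen to keep the numbers small: a 1000 Hz carrier sampled 8000 times per second, eight samples per bit, half amplitude for a 0 bit under AM, and frequencies of 0.5 and 1.5 times the carrier under FM. The function name is hypothetical.

```python
import math

def modulated_samples(bits, scheme="AM", carrier_hz=1000.0, sample_rate=8000,
                      samples_per_bit=8):
    """Generate waveform samples that encode bits by amplitude or by frequency."""
    samples = []
    for n, bit in enumerate(bits):
        for k in range(samples_per_bit):
            t = (n * samples_per_bit + k) / sample_rate
            if scheme == "AM":
                amplitude = 1.0 if bit else 0.5              # taller wave for 1, shorter for 0
                freq = carrier_hz                            # frequency stays at the carrier
            else:                                            # "FM"
                amplitude = 1.0                              # height stays the same
                freq = carrier_hz * (1.5 if bit else 0.5)    # faster for 1, slower for 0
            samples.append(amplitude * math.sin(2 * math.pi * freq * t))
    return samples

am = modulated_samples([1, 0], scheme="AM")
print(max(am[:8]), max(am[8:]))   # peak height of the 1 bit, then of the 0 bit

fm = modulated_samples([1, 0], scheme="FM")
print(len(fm))                    # 16 samples: eight per bit
```

In the AM output the two bits differ only in peak height, while in the FM output they differ only in how quickly the samples oscillate.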
FM and AM have long been used for analog signal transmission. Readers may associate these terms with
radio stations, and for good reason: Radio stations with AM channels use amplitude modulation to
encode sound in the range of 540 kHz to 1600 kHz (kHz = 1,000 cycles per second). FM radio stations
use frequency modulation to encode sound between 88 MHz and 108 MHz (MHz = 1,000,000 cycles per
second). So, a radio station that advertises itself as FM 101.1 is transmitting its audio by
shifting the frequency to be slightly above and slightly below 101.1 MHz. Note, though, that FM and
AM are not restricted to analog radio signals. All three techniques are used to modulate digital
signals, as well.
Consequently, when we say that Ethernet is sending the preamble of a frame by changing the
transmitted bit from 1 to 0, it means that the network device is using one of these techniques to
manipulate the signal it is transmitting. The receiver performs the corresponding demodulation to
separate the encoded information from the carrier wave and records the transmitted bit. By coordinating this signal transmission,
the link layer can transmit a packet from one network device to another. These links can then be
chained together to establish network routing, leading to higher levels of communication protocols.
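As a closing illustration, the sketch below sends the preamble bits through a modulate and demodulate round trip. Binary PSK is used purely for illustration, not because any particular Ethernet standard uses it. The choice of one carrier cycle per bit, 16 samples per cycle, and a half-cycle phase shift for a 0 bit are all assumptions, and the function names are hypothetical.

```python
import math

SAMPLES_PER_BIT = 16   # one full carrier cycle per bit (an assumption of this sketch)

def carrier(k):
    """Sample k of one cycle of the unmodulated carrier signal."""
    return math.sin(2 * math.pi * k / SAMPLES_PER_BIT)

def transmit(bits: str):
    """BPSK: send the carrier unchanged for 1 and shifted by half a cycle for 0."""
    wave = []
    for bit in bits:
        phase = 0.0 if bit == "1" else math.pi
        for k in range(SAMPLES_PER_BIT):
            wave.append(math.sin(2 * math.pi * k / SAMPLES_PER_BIT + phase))
    return wave

def receive(wave) -> str:
    """Correlate each bit period with the reference carrier; the sign gives the bit."""
    bits = []
    for start in range(0, len(wave), SAMPLES_PER_BIT):
        score = sum(wave[start + k] * carrier(k) for k in range(SAMPLES_PER_BIT))
        bits.append("1" if score > 0 else "0")
    return "".join(bits)

preamble = "10101010" * 7 + "10101011"
print(receive(transmit(preamble)) == preamble)   # True
```

The receiver never sees the bits directly; it recovers them only by comparing the incoming waveform against the carrier it expects.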