Background information
DNS is a distributed database system that is used to translate human-readable
domain names (e.g., www.jmu.edu) into IP addresses. The protocol
is defined in RFC 1034 and the
key data structures are defined in
RFC 1035.
In its simplest form, DNS is a stateless request-response protocol. A DNS
client sends a query to a server. This query contains one or more
questions that indicate the domain name under consideration and
the type of DNS record being retrieved. The server will then try to match
this information with the requested record. If found, the server's response
will include one or more answers to the query.
The textbook provides an overview of the structure of queries, responses,
and resource records, with more details in
RFC 1035. The following
example illustrates a query for the IPv4 address for jmu.edu:
12 34 01 00 00 01 00 00 00 00 00 00 03 6a 6d 75 03 65 64 75 00 00 01 00 01
The first twelve bytes are the header and can be interpreted as follows:
Bytes | Field | Meaning
1234 | XID=0x1234 | random identifier
0100 | OPCODE=SQUERY, RD=1 | message is a request (standard query, recursion desired)
0001 | QDCOUNT=1 | 1 question is asked
0000 | ANCOUNT=0 | 0 answers provided
0000 | NSCOUNT=0 | 0 authority records provided
0000 | ARCOUNT=0 | 0 additional information records provided
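The header layout above can be checked with a short sketch. Python is used here purely for illustration; the function name parse_header and the dictionary keys are assumptions made for this example, not part of the project's required interface.

```python
import struct

# The header is six unsigned 16-bit fields in network (big-endian) byte order.
HEADER_FORMAT = "!HHHHHH"


def parse_header(message: bytes) -> dict:
    """Split the first 12 bytes of a DNS message into its six header fields."""
    xid, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        HEADER_FORMAT, message[:12]
    )
    return {
        "xid": xid,
        "flags": flags,
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }


# The example query for jmu.edu from the text.
query = bytes.fromhex(
    "12 34 01 00 00 01 00 00 00 00 00 00 "
    "03 6a 6d 75 03 65 64 75 00 00 01 00 01"
)
header = parse_header(query)
print(hex(header["xid"]), hex(header["flags"]), header["qdcount"])
```

Reading every field as a big-endian 16-bit value avoids byte-order mistakes on little-endian machines.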
The remaining thirteen bytes are the question asked in this
query. In DNS, domain names are not written in the standard dotted notation.
Instead, one byte is used to indicate the length of the next portion of the
address. So 03 6a 6d 75 is the "jmu" portion
followed by 03 65 64 75 ("edu"). The next byte is
the null byte (00) to indicate the end of the address.
The final four bytes indicate the QTYPE is 00 01
(A record, which indicates IPv4) followed by the QCLASS, which is
also 00 01 (IN, which indicates "Internet"). In this
project, all records will use the IN value for the
QCLASS value, but you will support different QTYPE
records.
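As a minimal sketch of the encoding just described, the following Python builds the question section and a complete query. The names encode_name and build_query are hypothetical helpers chosen for this illustration; the flags value 0x0100 is taken from the example header.

```python
import struct

QTYPE_A = 1    # IPv4 address record
QCLASS_IN = 1  # Internet class


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels ending in a null byte."""
    encoded = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue  # an empty name (the root) contributes no labels
        if len(label) > 63:
            raise ValueError("a DNS label is limited to 63 bytes")
        encoded += bytes([len(label)]) + label.encode("ascii")
    return encoded + b"\x00"


def build_query(xid: int, name: str, qtype: int = QTYPE_A) -> bytes:
    """Build a one-question query: 12-byte header, then name, QTYPE, QCLASS."""
    header = struct.pack("!HHHHHH", xid, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, QCLASS_IN)
    return header + question


print(build_query(0x1234, "jmu.edu").hex(" "))
```

Note that QTYPE is written before QCLASS. Because both are 00 01 in the example, swapping them would go unnoticed until a different record type is requested.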
The corresponding response would be:
12 34 81 80 00 01 00 01 00 00 00 00 03 6a 6d 75 03 65 64 75 00 00 01 00 01 c0 0c 00 01 00 01 00 00 03 84 00 04 86 7e 7e 63
The first 25 bytes of this response are an exact copy of the request with
two differences. Bytes three and four (81 80) set additional
flags (RESPONSE and RA) to indicate that it is a
response and recursive lookups are available. Bytes seven and eight indicate
that one answer is provided.
The answer starts with the bytes c0 0c. The
structure of A records begins with the domain name. However, DNS
compresses the responses by avoiding repetition. The two high bits of the
first byte (c0) mark a compression pointer, and the remaining 14 bits of
the two-byte value give the offset (0x0c, byte 12) at which the name was
already written. (See the encoding of jmu.edu described above.)
The remaining bytes indicate the record's TYPE is 00 01
(A), the CLASS is 00 01
(IN), the TTL is 900 (0x384) seconds, the
size of the data (RDLENGTH) is 4 bytes and the data result
(RDATA) is 86 7e 7e 63 (134.126.126.99).
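The answer parsing described above, including the compression pointer, might be sketched as follows. This is an illustrative Python sketch, not the required implementation; decode_name and parse_record are hypothetical names. The sample bytes encode RDLENGTH as the two bytes 00 04, since RFC 1035 defines it as a 16-bit field.

```python
import struct


def decode_name(message: bytes, offset: int):
    """Return (dotted name, offset just past the name as stored at offset)."""
    labels = []
    resume = None  # where parsing continues after the first pointer
    hops = 0
    while True:
        length = message[offset]
        if length & 0xC0 == 0xC0:  # two high bits set: compression pointer
            if resume is None:
                resume = offset + 2
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            hops += 1
            if hops > len(message):
                raise ValueError("compression pointer loop")
            continue
        offset += 1
        if length == 0:  # null byte ends the name
            break
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels) + ".", (offset if resume is None else resume)


def parse_record(message: bytes, offset: int):
    """Parse one resource record; return (record dict, offset of next record)."""
    name, offset = decode_name(message, offset)
    rtype, rclass, ttl, rdlength = struct.unpack(
        "!HHIH", message[offset:offset + 10]
    )
    offset += 10
    record = {
        "name": name,
        "type": rtype,
        "class": rclass,
        "ttl": ttl,
        "rdata": message[offset:offset + rdlength],
    }
    return record, offset + rdlength


response = bytes.fromhex(
    "12 34 81 80 00 01 00 01 00 00 00 00 "
    "03 6a 6d 75 03 65 64 75 00 00 01 00 01 "
    "c0 0c 00 01 00 01 00 00 03 84 00 04 86 7e 7e 63"
)
record, end = parse_record(response, 25)  # the answer begins after the question
print(record["name"], record["ttl"], ".".join(str(b) for b in record["rdata"]))
```

The hop counter guards against a malformed message whose pointers refer to each other in a cycle.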
In this project, you will be implementing a DNS client, formatting
queries according to the structure above. Your client will receive the
responses as the binary data and print out the results in a manner similar
to the dig command-line utility.
Implementation requirements
This project is designed to be completed incrementally in multiple phases.
You should plan on an average of 10-14 work days for each phase. If you commit
to this schedule, you will be able to complete all phases by the final deadline.
Phase 1: Socket communication basics
Your first task is to build a hard-coded request to a provided DNS server
and interpret the results. You will use a hard-coded XID value of
1 and an empty domain name. That is, your request will consist of the
following bytes:
00 01 01 00 00 01 00 00 00 00 00 00 00 00 01 00 01
The response that comes back will be for one of the root servers. The
server will select one to use for the reply randomly, and you will need to
report the results based on the data received. For the structure of the
output, see the files in p2-dns/tests/expected.
NOTE: DNS utilities like
dig have a convention of appending a dot on the end of an
interpreted domain name. As such, you should make sure to indicate that the
domain name is a.root-servers.net. (with the dot) rather than
a.root-servers.net (without).
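The Phase 1 exchange amounts to sending those 17 bytes in a UDP datagram and reading one datagram back. The Python sketch below illustrates the idea only. The function name exchange is hypothetical, and the server address is passed in as a parameter because this page does not state the host or port the provided server uses.

```python
import socket

# The hard-coded Phase 1 request: XID 1, one question, empty (root) name.
PHASE1_QUERY = bytes.fromhex("00 01 01 00 00 01 00 00 00 00 00 00 00 00 01 00 01")


def exchange(payload: bytes, server: tuple, timeout: float = 5.0) -> bytes:
    """Send one UDP datagram to server (host, port) and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # raises socket.timeout if nothing arrives
        sock.sendto(payload, server)
        reply, _sender = sock.recvfrom(512)  # classic DNS-over-UDP size limit
        return reply
```

A timeout matters here because UDP gives no indication that a datagram was lost. Without one, a client whose request never reached the server would block forever.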
Testing your client
Throughout this project, you are building a client that will interact with
a pre-built server. Due to the nature of network-based communication, you
should NOT rely on make test for testing your
code. Specifically, if you did so, you would not be able to distinguish between
your client failing to send the data, your client sending invalid data, and
your client failing to retrieve the response.
Instead, you will need to use two terminal windows: start the server
manually in one, then run your client in the other to send the request. In the
p2-dns/tests directory, running ./dukens -s 10
will start the server and wait up to 10 seconds for a request. You can adjust
the wait time with a different -s argument. You can then run
your client as ./digduke in the p2-dns directory.
The window running the server will provide helpful information about your
code's functionality. When the server receives a packet, it will display
the bytes it received. The server will also try to interpret the query's
question and display the bytes it is sending in response.
Phase 2: DNS queries for IPv4 records (C requirements)
Once you have basic network communication working, you will add support
for sending requests and receiving responses for A records,
both with and without compression. The format of your output will be similar
to that used by dig. For example, the response described in the
background section of this page would be formatted as follows:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4660
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;jmu.edu. IN A
;; ANSWER SECTION:
jmu.edu. IN A 900. 134.126.126.99
Much of this output can be compared with the example described above.
The status and flags fields warrant more explanation.
Despite their separate labels in this output, they both derive from the
third and fourth bytes of the response header. In the example above, these
bytes were 81 80. These indicate the qr (query
response), rd (recursion desired), and ra (recursion
available) bits are set. You will also need to detect whether the
aa (authoritative answer) bit is set (resulting in 8580).
The last hex digit is used to indicate the status. A value
of 0 (NOERROR) indicates success. A value of
2 (SERVFAIL) would indicate that the server could not process
the query (here, because of a bad domain name).
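The flag and status decoding can be sketched in a few lines of Python. The bit positions and status codes follow RFC 1035; the function name describe_flags and the tuple it returns are assumptions made for this illustration.

```python
# Single-bit flags in the order dig prints them, with their masks.
FLAG_BITS = [
    ("qr", 0x8000),  # message is a response
    ("aa", 0x0400),  # authoritative answer
    ("tc", 0x0200),  # truncated
    ("rd", 0x0100),  # recursion desired
    ("ra", 0x0080),  # recursion available
]
OPCODES = {0: "QUERY", 1: "IQUERY", 2: "STATUS"}
RCODES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def describe_flags(flags: int):
    """Return (opcode name, status name, space-separated flag names)."""
    opcode = (flags >> 11) & 0xF  # four bits following the qr bit
    rcode = flags & 0xF           # the last hex digit
    names = [name for name, mask in FLAG_BITS if flags & mask]
    return (
        OPCODES.get(opcode, str(opcode)),
        RCODES.get(rcode, str(rcode)),
        " ".join(names),
    )


print(describe_flags(0x8180))
print(describe_flags(0x8580))
```

Keeping the masks in a table makes the output order explicit and keeps each bit test on one line.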
Phase 3: Multiple responses (B requirements)
In the previous phase, all requests were for IPv4 records and the
response contained a single A record. In this phase, you'll
extend your client to support a wider array of responses. As a first step,
you will handle responses that return multiple IPv4 records rather than a
single one. In practice, a network client (e.g., a web browser) would
select one of these answers at random to use. Having multiple
IP addresses for something like a web server distributes the workload
across multiple instances of the server; sending all requests to
a single centralized server would leave it vulnerable to being overloaded.
Your next step will be to support sending and processing requests for
other types of DNS records. Specifically, you'll add support for IPv6
addresses (AAAA records), SMTP mail exchange servers
(MX records), and DNS name servers (NS records).
You will also need to handle a few circumstances that are not immediately
obvious. First, some of the responses will use compression techniques more
than once. Second, you will need to support MX records that have
empty domain names (used to indicate that there is no such mail server).
Finally, you'll need to handle the case where a record is sought but the
response contains no answers. Note that, in this last case, the query is
considered to have a NOERROR status; it is just that the
number of answers is 0.
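The record types added in this phase differ only in how RDATA is interpreted. The Python sketch below shows one way to handle the address types and the fixed part of an MX record. The type numbers come from the DNS specifications; the function names are hypothetical, and the IPv6 address in the usage lines is a made-up documentation address, not project data.

```python
import ipaddress
import struct

TYPE_A, TYPE_NS, TYPE_MX, TYPE_AAAA = 1, 2, 15, 28


def format_address(rtype: int, rdata: bytes) -> str:
    """Render the RDATA of an A (4-byte) or AAAA (16-byte) record as text."""
    if rtype == TYPE_A and len(rdata) == 4:
        return str(ipaddress.IPv4Address(rdata))
    if rtype == TYPE_AAAA and len(rdata) == 16:
        return str(ipaddress.IPv6Address(rdata))
    raise ValueError("not an address record of the expected length")


def split_mx(rdata: bytes):
    """Split MX RDATA into (preference, encoded exchange name)."""
    (preference,) = struct.unpack("!H", rdata[:2])
    return preference, rdata[2:]


print(format_address(TYPE_A, bytes.fromhex("86 7e 7e 63")))
print(format_address(TYPE_AAAA, bytes.fromhex("20010db8" + "00" * 11 + "01")))
print(split_mx(b"\x00\x00\x00"))  # preference 0, empty exchange name
```

The exchange name of an MX record (like the RDATA of an NS record) is an encoded domain name that may itself contain compression pointers, so it has to be decoded against the whole message rather than the RDATA slice alone. An exchange consisting of the single null byte is the empty domain name.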
Phase 4: More advanced records (A requirements)
In this final phase, you'll add support for a few more types of DNS
records, including canonical names (CNAME), reverse DNS lookup
(PTR), and start-of-authority (SOA) records. In
contrast to the previous record types, these rely on more than just the
ANSWER. Rather, they will also rely on the ADDITIONAL
and AUTHORITY fields to convey other information. Your client
will need to examine the NSCOUNT and ARCOUNT
fields to determine if they are used.
For CNAME results, the record itself indicates the canonical
name (e.g., stu.cs.jmu.edu for the domain name stu).
In some circumstances, the DNS server also contains record entries for this
canonical name. These records may include the IPv4 address. These records
may be returned in either the ADDITIONAL field or as more records
in the ANSWER field.
PTR record lookups require two special considerations. First,
the IPv4 address that is being considered must be converted to an
.in-addr.arpa domain name. In doing so, the address must be
reversed, because a domain name lists its most specific label first whereas
an IPv4 address lists its most significant octet first. For example, the
address 1.2.3.4 would be converted to
4.3.2.1.in-addr.arpa. Second, PTR results may be
accompanied by CNAME records to indicate which result should be
considered definitive.
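The address conversion for PTR lookups is small enough to sketch directly. In this Python illustration the function name reverse_name is hypothetical; the standard library's ipaddress module is used only to validate the input.

```python
import ipaddress


def reverse_name(address: str) -> str:
    """Convert a dotted IPv4 address to its .in-addr.arpa lookup name."""
    # Parsing first rejects malformed input such as "1.2.3" or "1.2.3.999".
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(reverse_name("1.2.3.4"))  # 4.3.2.1.in-addr.arpa
```

The resulting name is then encoded with length-prefixed labels like any other domain name in a question.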
Finally, you will need to add support for additional error conditions.
This could include detecting an NXDOMAIN status that indicates an error
or handling SOA records returned for IPv6 queries that do not have
answers.