WARNING: Before starting on this project, you
should be familiar with the Testing Procedures
and the Project Submission Procedures, as these
are different from those used in CS 261.
Background information
DNS is a distributed database system that is used to translate human-readable
domain names (e.g., www.jmu.edu) into IP addresses. The protocol
is defined in RFC 1034 and the
key data structures are defined in
RFC 1035.
In its simplest form, DNS is a stateless request-response protocol. A DNS
client sends a query to a server. This query contains one or more
questions that indicate the domain name under consideration and
the type of DNS record being retrieved. The server will then try to match
this information with the requested record. If found, the server's response
will include one or more answers to the query.
The textbook provides an overview of the structure of queries, responses,
and resource records, with more details in
RFC 1035. The following
example illustrates a query for the IPv4 address for jmu.edu:
12 34 01 00 00 01 00 00 00 00 00 00 03 6a 6d 75 03 65 64 75 00 00 01 00 01
The first twelve bytes are the header and can be interpreted as follows:
1234   XID=0x1234        random identifier
0100   OPCODE=SQUERY     message is a request
0001   QDCOUNT=1         1 question is asked
0000   ANCOUNT=0         0 answers provided
0000   NSCOUNT=0         0 authoritative records provided
0000   ARCOUNT=0         0 additional information records provided
The remaining thirteen bytes are the question asked in this
query. In DNS, domain names are not written in the standard dotted notation.
Instead, one byte is used to indicate the length of the next portion of the
address. So 03 6a 6d 75 is the "jmu" portion
followed by 03 65 64 75 ("edu"). The next byte is
the null byte (00) to indicate the end of the address.
The final four bytes indicate the QTYPE is 00 01
(A record, which indicates IPv4) and the QCLASS is
00 01 (IN, which indicates "Internet"). In this
project, all records will use the IN value for the
QCLASS value, but you will support different QTYPE
records.
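The length-prefixed encoding described above can be produced with a short helper. The sketch below assumes you manage your own output buffer; encode_name is an illustrative name, not part of the starter code.

```c
#include <stddef.h>
#include <string.h>

/* Sketch: convert a dotted name like "jmu.edu" into DNS wire format
   (length-prefixed labels followed by a null byte). Returns the number
   of bytes written, or 0 on error. */
size_t encode_name (const char *name, unsigned char *buf, size_t buflen)
{
  size_t out = 0;
  while (*name != '\0')
    {
      const char *dot = strchr (name, '.');
      size_t len = (dot != NULL) ? (size_t) (dot - name) : strlen (name);
      if (len == 0 || len > 63 || out + len + 2 > buflen)
        return 0;                       /* labels are limited to 63 bytes */
      buf[out++] = (unsigned char) len; /* length byte */
      memcpy (buf + out, name, len);    /* label characters */
      out += len;
      name += len;
      if (*name == '.')
        name++;
    }
  if (out >= buflen)
    return 0;
  buf[out++] = 0x00;                    /* terminating null byte */
  return out;
}
```

Note that an empty input encodes to the single null byte, which is exactly the root name used in the BASIC request later in this page.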
The corresponding response would be:
12 34 81 80 00 01 00 01 00 00 00 00 03 6a 6d 75 03 65 64 75 00 00 01 00 01 c0 0c 00 01 00 01 00 00 03 84 04 86 7e 7e 63
The first 25 bytes of this response are an exact copy of the request with
two differences. Bytes three and four (81 80) set additional
flags (RESPONSE and RA) to indicate that it is a
response and recursive lookups are available. Bytes seven and eight indicate
that one answer is provided.
The answer starts with the bytes c0 0c. The
structure of A records begins with the domain name. However, DNS
compresses the responses by avoiding repetition. The leading bits of
c0 mark a compression pointer; the pointer repeats the bytes
starting at offset 0c (byte 12). (See the encoding of
jmu.edu described above.)
The remaining bytes indicate the QTYPE is 00 01
(QTYPE=A), the QCLASS is 00 01
(QCLASS=IN), the TTL is 900 (0x384) seconds, the
size of the data (RDLENGTH) is 4 bytes and the data result
(RDATA) is 86 7e 7e 63 (134.126.126.99).
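Multi-byte fields in the packet (type, class, TTL, RDLENGTH) are big-endian on the wire, so they should be read byte by byte rather than cast directly. A minimal sketch, with helper names of our own choosing:

```c
#include <stdint.h>

/* Sketch: read big-endian integers from a DNS packet. Wire-format
   values are big-endian regardless of the host byte order, so build
   them one byte at a time. */
uint16_t read_u16 (const unsigned char *p)
{
  return (uint16_t) ((p[0] << 8) | p[1]);
}

uint32_t read_u32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
       | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}
```

For instance, applying read_u32 to the TTL bytes 00 00 03 84 yields 900.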
Project directory structure
The project directory structure is mostly described in the
Testing Documentation.
You will be modifying the files in p1-sh/src.
In this project, you will be implementing a DNS client and server.
The client (digduke) is driven by code in
src/client.c and the server (dukens) is driven by
src/server.c. Because so much functionality is common to
both, there are additional files (e.g., src/dns.c)
that you should use for this shared code.
Implementation requirements
Your first task is to build a minimal client that sends and processes DNS
data using a socket to a provided server. You will extend this implementation
with incrementally more features of DNS. In the later stages, you will switch
roles to implement the server using a provided client. Later, you will
incorporate multithreading into the client and server for concurrent processing.
Getting started: BASIC requirements
Your first task is to build a hard-coded request to a provided DNS server
and interpret the results. You will use a hard-coded XID value of
1 and an empty domain name. That is, your request will consist of the
following bytes:
00 01 01 00 00 01 00 00 00 00 00 00 00 00 01 00 01
The response that comes back will be for one of the root servers. The
server will randomly select one to use for the reply, and you will need to
report the results based on the data received. For the structure of the
output, see the files in p2-dns/tests/expected.
NOTE: DNS utilities like
dig have a convention of appending a dot on the end of an
interpreted domain name. As such, you should make sure to indicate that the
domain name is a.root-servers.net. (with the dot) rather than
a.root-servers.net (without).
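The request bytes shown above can be assembled directly into a buffer before being sent. The sketch below only builds the buffer; build_basic_request is an illustrative name, not part of the starter code.

```c
#include <stddef.h>
#include <string.h>

/* Sketch: build the hard-coded BASIC request shown above. The 12-byte
   header (XID=1, RD flag, QDCOUNT=1) is followed by the empty (root)
   domain name and a QTYPE/QCLASS of 1. Returns the request length. */
size_t build_basic_request (unsigned char *buf)
{
  static const unsigned char request[] = {
    0x00, 0x01,   /* XID = 1 */
    0x01, 0x00,   /* flags: RD (recursion desired) */
    0x00, 0x01,   /* QDCOUNT = 1 */
    0x00, 0x00,   /* ANCOUNT = 0 */
    0x00, 0x00,   /* NSCOUNT = 0 */
    0x00, 0x00,   /* ARCOUNT = 0 */
    0x00,         /* empty (root) domain name */
    0x00, 0x01,   /* QTYPE = A */
    0x00, 0x01    /* QCLASS = IN */
  };
  memcpy (buf, request, sizeof request);
  return sizeof request;
}
```

The resulting 17 bytes would then be sent over a UDP socket with sendto() and the reply read back with recvfrom().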
Testing your client
Throughout this project, you are building network applications that interact
with pre-built components. Due to the nature of network-based communication, you
should NOT rely on make test for testing your
code. Specifically, doing so would not let you distinguish between
your client failing to send the data, your client sending invalid data, and
your client failing to retrieve the response.
Instead, you will need to use two terminal windows: start the server
manually, then run your client separately to send the request. In the
p2-dns/tests directory, running ./dukens -s 10
will start the server and wait up to 10 seconds for a request. You can adjust
the wait time with a different -s argument. You can then run
your client as ./digduke in the p2-dns directory.
(In later phases, you will start your ./dukens server in
p2-dns then run the provided ./digduke client
from p2-dns/tests.)
The window running the server will provide helpful information about your
code's functionality. When the server receives a packet, it will display
the bytes it received. The server will also try to interpret the query's
question and display the bytes it is sending in response.
MIN requirements: DNS record types
Once you have basic network communication working, you will add support
for sending requests and receiving responses for A records,
both with and without compression. The format of your output will be similar
to that used by dig. For example, the response described in the
background section of this page would be formatted as follows:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4660
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;jmu.edu. IN A
;; ANSWER SECTION:
jmu.edu. IN A 900. 134.126.126.99
Much of this output can be compared with the example described above.
The status and flags fields warrant more explanation.
Despite their separate labels in this output, they both derive from the
third and fourth bytes of the response header. In the example above, these
bytes were 81 80. These indicate the qr (query
response), rd (recursion desired), and ra (recursion
available) bits are set. You will also need to detect whether the
aa (authoritative answer) bit is set (resulting in 8580).
The last hex digit is used to indicate the status. A value
of 0 (NOERROR) indicates success. A value of
2 (SERVFAIL) indicates that the server failed to process
the query (e.g., because of a bad domain name).
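Assuming the two flag bytes have been combined into a single 16-bit value (0x8180 in the example), the individual bits and the status can be masked out as follows. The accessor names are our own, not from the starter code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: extract flag bits and the RCODE (status) from the 16-bit
   value formed by bytes three and four of the header. */
bool flag_qr (uint16_t flags) { return (flags >> 15) & 1; } /* response */
bool flag_aa (uint16_t flags) { return (flags >> 10) & 1; } /* authoritative */
bool flag_rd (uint16_t flags) { return (flags >> 8) & 1; }  /* recursion desired */
bool flag_ra (uint16_t flags) { return (flags >> 7) & 1; }  /* recursion available */
uint8_t rcode (uint16_t flags) { return flags & 0x0f; }     /* status digit */
```

With this layout, 0x8180 has qr, rd, and ra set with a status of 0, while 0x8580 additionally has aa set.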
Your next step will be to support sending and processing requests for
other types of DNS records. Specifically, you'll add support for IPv6
addresses (AAAA records), SMTP mail exchange servers
(MX records), and DNS name servers (NS records).
You will also need to handle a few circumstances that are not immediately
obvious. First, some of the responses will use compression techniques more
than once. Second, you will need to support MX records that have
empty domain names (used to indicate that there is no such mail server).
Finally, you'll need to handle the case where a record is sought but the
response contains no answers. Note that, in this last case, the query is
considered to have a NOERROR status; it is just that the
number of answers is 0.
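Handling compression that appears more than once comes down to following each pointer inside a single decoding loop. Below is a sketch under the assumption that the whole packet is available in memory; decode_name is an illustrative helper, not part of the starter code.

```c
#include <stddef.h>
#include <string.h>

/* Sketch: expand a (possibly compressed) domain name into dotted form,
   following 0xc0 compression pointers, which may occur repeatedly.
   packet is the full response; offset is where the name begins. Writes
   a dot-terminated name (the dig convention) into out and returns its
   length, or 0 on error. */
size_t decode_name (const unsigned char *packet, size_t offset,
                    char *out, size_t outlen)
{
  size_t pos = offset;
  size_t written = 0;
  int hops = 0;
  while (packet[pos] != 0x00)
    {
      if ((packet[pos] & 0xc0) == 0xc0)
        {
          /* Compression pointer: low 14 bits give the new offset. */
          if (++hops > 16)
            return 0;                  /* guard against pointer loops */
          pos = (size_t) ((packet[pos] & 0x3f) << 8) | packet[pos + 1];
          continue;
        }
      size_t len = packet[pos++];
      if (written + len + 2 > outlen)
        return 0;
      memcpy (out + written, packet + pos, len);
      written += len;
      out[written++] = '.';            /* dot after every label */
      pos += len;
    }
  if (written == 0)
    {
      if (outlen < 2)
        return 0;
      out[written++] = '.';            /* the root name prints as "." */
    }
  out[written] = '\0';
  return written;
}
```

Applied to the example response above, decoding the name at offset 25 (the c0 0c pointer) and decoding at offset 12 both produce "jmu.edu." with the trailing dot.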
The provided tests/dukens server
can be run in a stand-alone mode to help you debug. You can pass a domain
name and record type to this program as an argument and it will show you what
the request packet should look like, what answers were found in the database,
and what the response would be in hex format. If you also pass the
-t flag to this command, it will list all of the records in the
database that we are using. For instance, try running the query
./tests/dukens -t google.com A.
INTER requirements: More complex records and a basic server
In this third phase, you'll add support for a few more types of DNS
records, including canonical names (CNAME) and
start-of-authority (SOA) records. In
contrast to the previous record types, these rely on more than just the
ANSWER. Rather, they will also rely on the ADDITIONAL
and AUTHORITY fields to convey other information. Your client
will need to examine the NSCOUNT and ARCOUNT
fields to determine if they are used.
For CNAME results, the record itself indicates the canonical
name (e.g., stu.cs.jmu.edu for the domain name stu).
In some circumstances, the DNS server also contains record entries for this
canonical name. These records may include the IPv4 address. These records
may be returned in either the ADDITIONAL field or as more records
in the ANSWER field. You will need to add support for additional
error conditions, such as NXDOMAIN status fields that
indicate an error, or SOA records returned for IPv6 queries that
do not have answers.
The last portion of this phase will be to implement a basic DNS server.
The code distribution contains a binary file that we will use as our database.
The records in this file are structured based on the key DNS components.
Your first task for this portion is to read this file in and to build an
in-memory table that you can use to look up the records. As an example,
consider this portion of the file:
$ hexdump -C tests/mappings.bin | tail -n 13 | head -n 2
00000780 00 00 03 6a 6d 75 03 65 64 75 00 01 00 84 03 00 |...jmu.edu......|
00000790 00 04 00 86 7e 7e 63 09 6c 6f 63 61 6c 68 6f 73 |....~~c.localhos|
Because we are just pulling a portion from the middle of the file, this
output starts in the middle of a record. The record that we are considering
is the bytes 03 6a 6d...7e 7e 63. The structure of all records
is as follows (using this record as an example):
- Domain name (variable length)
03 6a 6d 75 03 65 64 75 00 = jmu.edu
- Record type (2 bytes)
01 00 = type 1 (A record)
- Record TTL (4 bytes)
84 03 00 00 = 0x384 = 900 seconds
- RDATA length (2 bytes)
04 00 = 4 bytes
- RDATA (variable length)
86 7e 7e 63 = 134.126.126.99
For multi-byte fields (type, TTL, and length), the value is in
little-endian format, so you'll have to convert that as needed. Recall that
you can run the provided tests/dukens with queries to check that
your interpretation of the data is correct.
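Converting those little-endian fields is the mirror image of the big-endian reads needed for the wire format. A minimal sketch, with helper names of our own:

```c
#include <stdint.h>

/* Sketch: read the little-endian multi-byte fields used in
   mappings.bin. Unlike the wire format, the database stores type, TTL,
   and RDATA length least-significant byte first. */
uint16_t read_le16 (const unsigned char *p)
{
  return (uint16_t) (p[0] | (p[1] << 8));
}

uint32_t read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
       | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}
```

Using the example record, the type bytes 01 00 read as 1 (an A record) and the TTL bytes 84 03 00 00 read as 900.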
Once you've built the table, your task is to implement a basic server
responding to the first types of queries you built in your client. That is,
if the XID is 1, you'll build the response like the
ping test case. If XID is 2, you'll provide a basic
response for an A record without compression. Then add support for A records
with compression.
ADV requirements: Additional features and iterative lookups
The advanced features for this project start with supporting PTR
records. PTR record lookups require two special considerations. First,
the IPv4 address that is being considered must be converted to an
.in-addr.arpa domain name. In doing so, the address must be
reversed to reflect the hierarchical nature of IPv4. For example, the
address 1.2.3.4 would be converted to
4.3.2.1.in-addr.arpa. Second, PTR results may be
accompanied by CNAME records to indicate which result should be
considered definitive. The ptr test cases focus on support for
PTR records in your client.
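The address reversal can be done with ordinary string formatting. The sketch below parses the dotted form and emits the reversed .in-addr.arpa name; make_ptr_name is an illustrative helper, not part of the starter code.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Sketch: convert a dotted IPv4 address into the reversed
   .in-addr.arpa name used for PTR lookups. Returns 0 on success,
   -1 on error. */
int make_ptr_name (const char *ipv4, char *out, size_t outlen)
{
  unsigned a, b, c, d;
  if (sscanf (ipv4, "%u.%u.%u.%u", &a, &b, &c, &d) != 4
      || a > 255 || b > 255 || c > 255 || d > 255)
    return -1;
  /* The octets appear in reverse order, most specific first. */
  int n = snprintf (out, outlen, "%u.%u.%u.%u.in-addr.arpa", d, c, b, a);
  return (n > 0 && (size_t) n < outlen) ? 0 : -1;
}
```

For example, converting 1.2.3.4 produces 4.3.2.1.in-addr.arpa, matching the example in the text.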
The sadv test cases require expanding your server's lookup
functioning to add support for SOA, PTR,
NS, and AAAA records. In these test cases, you will
be modifying your server to function similarly to the provided dukens
implementation. That is, if your server is passed a domain name and record
type, it will not actually run as a server. Instead, it will look up the
record in the database and print out the fake request packet, the list of
answers, and the response block.
The last two tests involve modifying the client to simulate an iterative
DNS lookup. Specifically, assuming your client is looking up the IPv4 address
for stu.cs.jmu.edu, your client would need to do the following:
- Send a request to the server to find a root (.) NS server.
The server will reply with a randomly selected root server.
- Determine the name of the root server returned. Send a second request
to get that root server's IPv4 address.
- Repeat the previous two steps to get a top-level domain NS
server for the specified hostname, along with that server's IPv4 address.
In this case, you could look for the NS and A
records for the edu TLD.
- Send a request for the authoritative name server, which would be the
NS record for jmu.edu.
- Now get the IPv4 address for the hostname.
Note that you will still be contacting the same tests/dukens
on localhost (stu) for all requests, because we are
simulating iterative lookups. In a real implementation, your request
for the edu NS record (step 3) would be sent to the
IP address of the root server you got in step 2. Similarly, the request for
JMU's NS record would be sent to the IP address for the
edu TLD.
For simplicity, we will only be working with the edu TLD.
However, we would note that your code would work for any other TLD, so long
as the mappings.bin database had records for those servers.
WARNING: It may be tempting to
hard-code the requests to use the same root, TLD, and authoritative name
servers (e.g., a.root-servers.net, a.edu-servers.net,
and it-ns1-19.jmu.edu). Because the server randomly chooses which
servers to return in its response, this approach would succeed 1 out of every
18 tries. However, when this is tested, it will be run multiple times to
verify that you are responding to the actual results returned.
Finally, your last task is to make your iterative client multithreaded.
That is, you will be given multiple hostnames to do an iterative lookup on.
In doing so, you will create a separate thread for each hostname. Your
implementation must show that it is truly multithreaded by handling the
introduction of random delays on the server side. For instance, assume your
code is doing lookups for both stu.cs.jmu.edu and
www.jmu.edu. A sampling of the order of output might be:
stu thread: send request for root NS
www thread: send request for root NS
www thread: receive root NS
www thread: send request for root A
stu thread: receive root NS
www thread: receive root A
www thread: send request for edu NS
stu thread: send request for root A
...
Running the code again might produce a different order of these messages.
For instance, message 5 (stu thread receiving the NS
record) might be received as the second message before the www
thread even starts.
Your code must use a lock to ensure that the messages produced do not
get improperly interleaved. That is, you must make sure that your client
writes all of the output for step 3 before it writes any of the output for
step 5. It is unacceptable for the lines of output for these two steps to
be mixed.
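One workable pattern, sketched below with pthreads, is a single global mutex held around each complete step's output. The names here are illustrative, not from the starter code.

```c
#include <pthread.h>
#include <stdio.h>

/* Sketch: serialize multi-line output with one global mutex so that
   lines from different lookup threads cannot interleave. */
static pthread_mutex_t output_lock = PTHREAD_MUTEX_INITIALIZER;

void report_step (const char *thread_name, const char *message)
{
  pthread_mutex_lock (&output_lock);
  /* Everything printed while the lock is held appears contiguously,
     even if another thread reports at the same time. */
  printf ("%s thread: %s\n", thread_name, message);
  fflush (stdout);
  pthread_mutex_unlock (&output_lock);
}
```

If a step produces several lines, print all of them inside one lock/unlock pair rather than locking per line; that is what guarantees the step's output stays together.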