Project 1: Mini-ELF verifier
This project is based on a project originally written by Dr. Michael Kirkpatrick.
This project serves as an introduction to C programming and writing standard Linux command-line programs, and the goal is to reinforce the concept that information = bits + context.
You must read in a binary file and print its contents based on contextual information, first as simply a series of bytes and then according to the file's format. The file is formatted in a simplified version of the Executable and Linkable Format (ELF) used to store relocatable object files in Linux. The format is simplified, but it still very much resembles the format used to store "real" executables.
For generic project instructions, refer to the project guide.
Here is the path to the starter tarball file on stu:
/cs/students/cs261/f16/src/p1-check.tar.gz
UPDATE 9/14: One of the private tests was slightly mislabeled in the original distribution. It is fixed now in the tarball, and there is an updated private.o in the same folder on stu that you can copy into your tests folder to see the updated name (and a new test that does what the old name implied). No changes have been made to the project specification.
Mini-ELF format
The details of the Mini-ELF format are described by comments in elf.h. You will need to read the documentation in that file to proceed. For this project, you need only focus on reading the first 16 bytes of the file, which is the ELF header. This header provides information about how to interpret the rest of the file, which you will need to do in the next project. In this project, you should read the header into an in-memory data structure (provided for you in elf.h) and print the values accordingly.
We have included several example Mini-ELF files in the tests/inputs subfolder of the project distribution; you can use these to test your program from the command line.
Command-line parsing
For this project, your program must parse command-line parameters according to the following interface:
Usage: y86 <option(s)> mini-elf-file
 Options are:
  -h      Display usage
  -H      Show the Mini-ELF header
It is strongly recommended that you use the getopt() library function to parse the command-line parameters.
For this project, there are only two possible flags ("-h" and "-H"). The first flag ("-h") is the standard "help" option that is customary in Linux programs. If that flag is passed, your program should print the help text (use the provided usage_p1 function!) and exit immediately. The second flag ("-H" -- note that capitalization matters here as much as it does in the C language itself!) should cause your program to print the Mini-ELF header in the format described below.
Remember that you will also need to save the file name of the Mini-ELF object file to read. This name should be returned to the caller through the final parameter, unless the parameters are invalid or the user passes "-h".
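As a rough illustration of the interface above, the following sketch parses the two flags with getopt(). The names demo_parse and demo_usage are hypothetical stand-ins (the project supplies usage_p1 and requires parse_command_line_p1). Requiring exactly one trailing non-option argument as the file name is an assumption, not something the specification states.

```c
/* Only needed when compiling in a strict mode such as -std=c99. */
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>   /* getopt, optind */

/* Hypothetical stand-in for the provided usage_p1 function. */
static void demo_usage(void)
{
    printf("Usage: y86 <option(s)> mini-elf-file\n");
    printf(" Options are:\n");
    printf("  -h      Display usage\n");
    printf("  -H      Show the Mini-ELF header\n");
}

/* Hypothetical sketch of a parser with the same shape as
 * parse_command_line_p1; it is not the reference solution. */
bool demo_parse(int argc, char **argv, bool *header, char **file)
{
    if (argv == NULL || header == NULL || file == NULL) {
        return false;
    }
    *header = false;
    *file = NULL;
    optind = 1;   /* restart scanning so the function can be called again */

    int opt;
    while ((opt = getopt(argc, argv, "hH")) != -1) {
        switch (opt) {
        case 'h':
            demo_usage();
            return true;      /* caller decides to exit; *file stays NULL */
        case 'H':
            *header = true;
            break;
        default:              /* getopt returns '?' for unknown flags */
            return false;
        }
    }

    /* Assumption: exactly one non-option argument (the file name) remains. */
    if (optind != argc - 1) {
        return false;
    }
    *file = argv[optind];
    return true;
}
```

A caller in main() could check whether the file pointer is still NULL after a successful return to tell the "-h" case apart from a normal run.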
Output specification
If the -H option is passed, you must first print out the first 16 bytes of the file in a little-endian format similar to that produced by the hexdump utility. Then, you must re-interpret those 16 bytes as a Mini-ELF header, printing its contents in a well-formatted structure that is best demonstrated by example. The following sample output shows the expected output for some of the provided Mini-ELF test files (in the tests/inputs subfolder):
$ ./y86 -H tests/inputs/simple.o
00000000  01 00 00 01 10 00 02 00  58 00 70 00 45 4c 46 00
Mini-ELF version 1
Entry point 0x100
There are 2 program headers, starting at offset 16 (0x10)
There is a symbol table starting at offset 88 (0x58)
There is a string table starting at offset 112 (0x70)
$ ./y86 -H tests/inputs/stripped.o
00000000  01 00 00 01 10 00 02 00  00 00 00 00 45 4c 46 00
Mini-ELF version 1
Entry point 0x100
There are 2 program headers, starting at offset 16 (0x10)
There is no symbol table present
There is no string table present
The first line is a simple little-endian hex dump of the Mini-ELF header data (i.e., the first 16 bytes of the file). The rest of the output contains contextual data from the Mini-ELF header, such as the version number, the entry point address, and information about the program headers and symbol/string tables. See elf.h for help understanding the information required and how to access it from the ELF header struct. If a table address is zero, that indicates that the table is not present in the Mini-ELF file and you should print the appropriate message as shown above.
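The zero-means-absent rule for the two tables can be sketched as a small formatting helper. The function name describe_table and its buffer-based interface are hypothetical, chosen only so the result is easy to inspect; the actual dump_header prints directly and takes the header struct from elf.h.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes one table line into out and returns the
 * number of characters the full line needs (as snprintf does).
 * kind is "symbol" or "string"; offset comes from the header. */
int describe_table(char *out, size_t out_size, const char *kind,
                   unsigned offset)
{
    if (offset == 0) {
        /* A zero offset means the table is absent. */
        return snprintf(out, out_size, "There is no %s table present", kind);
    }
    /* Otherwise show the offset in decimal and again in hex. */
    return snprintf(out, out_size,
                    "There is a %s table starting at offset %u (0x%x)",
                    kind, offset, offset);
}
```

Calling describe_table with "symbol" and 0x58 yields the symbol table line from the simple.o sample, and calling it with an offset of 0 yields the "no ... table present" line from the stripped.o sample.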
Your output must match the expected output exactly. Note there's a double space after the file offset (which will always be "00000000" for this project because you're only printing the first 16 bytes), and then another double space separating the first eight bytes from the second eight bytes on the same row. This format improves visibility a bit, especially in the next project where you will be printing out much larger segments of hex data.
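The spacing rules just described can be sketched as follows. The function format_hex_row is a hypothetical helper that builds the row in a caller-supplied buffer; whether the reference output ends the row with any trailing whitespace is not stated here, so this sketch emits none.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats one 16-byte row as
 * "<8-digit offset>  <8 bytes>  <8 bytes>" in lowercase hex.
 * Returns the number of characters the full row needs. */
int format_hex_row(char *out, size_t out_size, unsigned offset,
                   const unsigned char bytes[16])
{
    int used = snprintf(out, out_size, "%08x", offset);

    for (int i = 0; i < 16 && used >= 0 && (size_t)used < out_size; i++) {
        /* Two spaces before bytes 0 and 8, one space before the others. */
        const char *sep = (i % 8 == 0) ? "  " : " ";
        used += snprintf(out + used, out_size - (size_t)used,
                         "%s%02x", sep, bytes[i]);
    }
    return used;
}
```

For the simple.o header bytes shown in the sample output, the row produced is 58 characters long: 8 for the offset, then two groups of 23 characters, each preceded by two spaces.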
Requirements
Here are the required functions that you must implement in p1-check.c. We will use unit tests to exercise this portion of your submission.
- bool parse_command_line_p1 (int argc, char **argv, bool *header, char **file);
Using the command-line options passed in argv, set the boolean pointed to by header to true if the -H option is passed. Set *file to point to the file name string. Return true if and only if valid arguments are passed. If the -h option is passed, you should also return true, but the function should also print the help message and your program should exit immediately after returning (however, you should exit in main(), not in this function!).
- bool read_header (FILE *file, elf_hdr_t *hdr);
Read bytes from the file into the space pointed to by hdr. This function should do error checking to make sure the header is valid. Return true if and only if a valid Mini-ELF header was read (e.g., the file is large enough and the header has the proper magic number).
- void dump_header (elf_hdr_t hdr);
Print the Mini-ELF header passed in hdr according to the specification described above.
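To make the read_header contract more concrete, here is a minimal sketch of reading and validating a 16-byte header. The struct demo_hdr_t and its field names are hypothetical: the layout is inferred from the sample dumps (six 16-bit little-endian fields followed by the bytes 45 4c 46 00), and the real elf_hdr_t in elf.h is authoritative. Reading straight into the struct also assumes a little-endian host.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 16-byte layout inferred from the sample output;
 * consult elf.h for the real structure and field names. */
typedef struct {
    uint16_t version;
    uint16_t entry;
    uint16_t phdr_start;
    uint16_t num_phdr;
    uint16_t symtab;
    uint16_t strtab;
    unsigned char magic[4];
} demo_hdr_t;

/* Hypothetical sketch with the same shape as read_header. */
bool demo_read_header(FILE *file, demo_hdr_t *hdr)
{
    if (file == NULL || hdr == NULL) {
        return false;
    }
    /* fread returns the number of complete items read, so anything
     * other than 1 means the file is too small or the read failed. */
    if (fread(hdr, sizeof *hdr, 1, file) != 1) {
        return false;
    }
    /* Assumed magic number: the bytes 'E', 'L', 'F', 0. */
    static const unsigned char expected[4] = { 'E', 'L', 'F', '\0' };
    return memcmp(hdr->magic, expected, sizeof expected) == 0;
}
```

With the simple.o bytes from the sample output, this sketch would accept the header and leave version equal to 1 and entry equal to 0x100; a file shorter than 16 bytes would be rejected by the fread check.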
In addition, you must implement main() in main.c such that your program behaves as described above. We will use integration tests to exercise this portion of your submission. Make sure you use the functions from p1-check.o in main.c--do not re-invent the wheel!
Error checking
Robust software proactively checks for potential errors and responds appropriately when an error occurs. Failure to build robust software leads to security breaches, lost sales, and other problems; this failure is not acceptable. Our grading procedures will try to break your code. The following list is a sample (not complete) of the types of errors we may test:
- Passing NULL values in pointer parameters
- Passing names of files that do not exist or have permission restrictions that prevent reading
- Passing invalid command-line options
- Not passing a file name
- Passing the name of a file that is too small
- Passing a file that contains an invalid header
If the given file cannot be opened or contains an invalid header, your program should print the error message "Failed to read file" with a newline and exit with the EXIT_FAILURE code defined in stdlib.h.
The above list is not necessarily exhaustive, and you should think carefully about what sort of errors might occur so that you can prevent them or perform additional error checking as needed. In particular, we will also use valgrind to detect memory leaks. Failure to respond appropriately will result in grade reductions.
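The failure path described in this section might look roughly like the sketch below. The function demo_run is hypothetical, as is the inline validity check (16 bytes available, with the bytes 'E', 'L', 'F', 0 at offset 12, as inferred from the sample dumps); a real main() would call read_header instead. Printing the message to standard output rather than standard error is also an assumption.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical driver: returns the exit code main() would use. */
int demo_run(const char *path)
{
    unsigned char raw[16];
    FILE *fp = (path != NULL) ? fopen(path, "rb") : NULL;

    /* Valid only if the file opened, all 16 bytes were read, and the
     * assumed magic bytes are present at offset 12. */
    bool ok = fp != NULL
              && fread(raw, 1, sizeof raw, fp) == sizeof raw
              && memcmp(raw + 12, "ELF", 4) == 0;

    /* Close on every path so that no stream is leaked. */
    if (fp != NULL) {
        fclose(fp);
    }

    if (!ok) {
        printf("Failed to read file\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

Funneling every failure through one exit path keeps the cleanup in a single place, which makes it easier to confirm with valgrind that nothing is leaked.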
Hints
- Read and ask questions early. If there is any part of this document or the project files that you do not understand, you should ask about it on Piazza as soon as possible.
- Start work early. You will be unable to finish this project if you leave it to the last few days before it is due. Set up your project work folder as soon as possible.
- Follow test-driven design. Before you write a single line of code, create test cases based on the output specification described above. Do NOT write any code without having a test case ready to test your code.
- Don't over-complicate. The reference solution for this project is around 175 lines of code in main.c and p1-check.c combined. If your solution grows to more than 250 or 300 lines, you may wish to re-evaluate your approach; it's likely that you are over-complicating.
- Use version control. Learn Git or Mercurial and keep your code in a repository. It will save you much time and anguish if you ever accidentally delete something.
- Learn to use a debugger. Debuggers allow you to "poke around" while your program is running to figure out where your mental model of the program differs from reality. This will make fixing problems much easier.
- Build features iteratively. Pay attention to the order of requirements and add features one at a time. For this project, you could start by hard-coding a file name, reading the first 16 bytes of that file, and printing the hex values. Then get the file name from the command line. Once you've got that working, then you should look at the ELF header stuff. Until you can read a binary file, you shouldn't even be thinking about that.
If you write 100 lines of code without compiling, you're doing it wrong; you will quickly become overwhelmed and frustrated with the number of bugs and compiler errors you will encounter. The key to success is to write a small amount of code, compile it, then test it. If after implementing the new functionality you fail a test case that you were passing previously, you should fix the regression before moving on to a new part of the project.
Grading
The following requirements are necessary to earn a grade of 'A' for this project:
1. Read the 16-byte Mini-ELF header and print it in hex format
2. Print the ELF header in the specified structured format
3. Accept all valid command-line options
4. Handle error checking (described above) appropriately
5. Reject invalid command-line arguments
Completing steps 1, 2, and 3 is required to earn a grade of 'B', while completing only step 1 will yield a maximum grade of 'C'. Note that these are the maximum grades you can earn. Inadequate error checking, failure to adhere to the coding standards, or deviations from the submit procedures will result in grade reductions as described in the project guide.
Failure to submit code that compiles on stu.cs.jmu.edu will result in an automatic grade of 0.
Submission
Due: Fri, Sep 16 at 23:59:59 ET (midnight)
Please see the project guide for general project help and grading policies. Please refer to the coding standards for coding practice guidelines.