Sorting PA

Introduction

The objective of this assignment is to explore various sorting algorithms.

Getting Started

First download and decompress one of the starter files:

Don't forget to download the appropriate Makefile for your platform from Piazza or the course website.

You will probably want to copy and paste your lab solutions. If you completed the lab and wrote the comparisons using the cmp function, you shouldn't need to modify your code much for this PA.

New Data Type

If you look in the distribution files, you will see that we have changed the list data types in sort.h. Instead of simple integers, they are now compound employee records consisting of an ID number and a salary:

/*
 * Employee info struct
 */
typedef struct {
    int id;
    int salary;
} employee_t;

/*
 * Generic data type.
 */
typedef employee_t* data_t;

Note that data_t is now a pointer to a struct rather than a data item. This means that the arrays that you will be sorting will be arrays of pointers, or referential arrays. This allows you to sort an array of employee information without actually moving the employee data around; you'll simply be moving around pointers to the data.

This allows us to test whether your sorting routines are stable: if we sort based on salary, we can check the ID numbers to see whether the sorting routine re-ordered employees with identical salaries.

The change in data_t, however, doesn't have much impact on your sorting routines as long as you used the cmp function in the lab, which handled the task of comparing two elements.

In this PA, we will primarily be concerned with sorting employee records by salary in ascending order. This necessitates a new cmp function, provided in sort.c:

int cmp(data_t a, data_t b)
{
    return (a->salary) - (b->salary);
}

This cmp function works the same way as the one from the lab, but now it does the comparison based on a the salary field of an employee record.

Part 1: Finish ALL the sorts!

Complete your implementations of all of the required algorithms as specified in the sorting lab. Here are some notes on our expectations of your implementations:

  • Bubble Sort - Your implementation should short-circuit whenever the remainder of the list is sorted. However, the short-circuit check should not involve significant additional computation! In other words, the check should run in O(1) time at each top-level iteration. This sort should be stable.

  • Selection Sort - The running time of your implementation should not depend at all upon whether the list is already sorted.

  • Insertion Sort - Your implementation should be significantly faster on nearly-sorted lists than on randomly-shuffled lists. In other words, each insertion step should stop once the new item has been placed in the correct location. This sort should be stable.

  • Merge Sort - Your implementation should be stable and should have O(n log n) worst-case running time. It should allocate new arrays at every level of recursion.

  • Quick Sort - Your implementation should be in-place and should have O(n log n) expected running time. It should choose a random element from the list to use as the pivot value.

Part 2: Improve the Sorts

Write three new sorting routines that are variants of existing sorts:

  • Binary Insertion Sort - Implement a variant of insertion sort that uses a binary search to find the location for each new element. For most inputs, this should be faster than your original insertion sort implementation because it performs fewer comparisons. This sort should be stable.

  • Swapping Merge Sort - Implement a variant of merge sort that only uses n extra space by swapping back and forth between the original array and a work array while merging. You will probably need to create a recursive helper function for this implementation. For most inputs, this should be faster than your original merge sort implementation because it allocates less space. This sort should be stable.

  • Midpoint Quick Sort - Implement a simple variant of quicksort that uses the middle element as the pivot instead of a randomly-selected element. For most inputs, this should be marginally faster than your original quick sort implementation because it avoids the overhead of the rand() function calls.

You should implement all of the sorts in this PA (including both Part 1 and Part 2) from scratch. In other words, you should not use the built-in qsort routine from the C standard library, nor should you simply copy/paste code from a website. You are welcome to use any references you'd like to help you develop your solution, though. If you do use a reference, please cite it in a comment.

In addition, none of your sorting routines should leak any memory. To verify this, it is sufficient to run valgrind on the main program.

You may add your own sorting routines to this PA framework just like you could for the sorting lab (see the last section for instructions), but you should make sure that none of the required sorts depend on your new sorting routines. We will not ask you to submit your main.c.

Final Submission

DUE DATE: Friday, November 13, 2015 at 23:59 EDT (11:59pm)

Submit ONLY your completed sort.c file on Canvas using the appropriate assignment submission. Make sure the file contains a comment field at the top with your name and a statement of compliance with the JMU honor code.

Please check your code carefully and make sure that it complies with the project specification. In addition, make sure that your code adheres to general good coding style guidelines. Fix any spelling mistakes, incorrect code formatting, and inconsistent whitespace before submitting. Make sure you include any appropriate documentation.