Sorting
Introduction - What is the World's Best Sorting Algorithm?
There is no single best sorting algorithm. Quicksort is the fastest known comparison-based sorting algorithm when applied to large, unordered sequences. It also has the advantage of being an in-place (or nearly in-place) sort. Unfortunately, quicksort has some weaknesses: its worst-case performance is \(\Theta(n^2)\), and it is not stable. Merge sort shares neither of these disadvantages: it is stable and it requires \(\Theta(n \log n)\) steps in the worst case. Unfortunately, merge sort requires \(\Theta(n)\) additional space and it runs more slowly than quicksort on most inputs.
What we really want is a sorting algorithm that is as fast as quicksort, stable, in-place, with \(\Theta(n \log n)\) worst-case performance. Sadly, no such algorithm has yet been discovered. In practice, most code libraries follow one of two paths: either they provide a modified version of quicksort that is able to avoid worst-case behavior on typical inputs, or they provide a modified version of merge sort that is able to close some of the performance gap through careful optimizations*.
The objective of this assignment is to write modified versions of merge and quicksort that exhibit better performance than the standard versions described in our textbook. You will implement three improvements (two for merge sort and one for quicksort) that should improve some aspect of the algorithm's performance without requiring a great deal of additional code. You will also experimentally evaluate the impact of your improvements.
The following zip file contains starter code, including the sort implementations provided by our textbook.
This zip file has the following contents:
- Sorting algorithms:
  - BasicSorts.java - Includes selection sort and insertion sort.
  - MergeSort.java - Includes merge sort.
  - QuickSort.java - Includes quicksort.
  - MergeSortImproved.java (UNFINISHED. See Parts 1 & 2 below.)
  - IntrospectiveSort.java (UNFINISHED. See Part 3 below.)
- An application for timing sorting algorithms:
  - SortProfiler.java
Note that SortProfiler relies on the JOpt Simple library for handling command-line arguments. You will need to have jopt-simple-5.0.4.jar in your classpath to use the application.
Part 1 - Improved Merges
The merge algorithm described in our textbook consists of the following two stages:
1. Copy all values from the two pre-sorted halves into a temporary array.
2. Merge the values from the temporary array back into the original array.
If \(n\) is the combined size of the two sub-arrays being merged, this algorithm requires a temporary array of size \(n\), and requires \(n\) assignment operations in stage 1. The following alternative approach cuts both of those values from \(n\) to around \(n/2\):
1. Copy the values from the first pre-sorted half to a temporary array.
2. Merge the values from the temporary array and the second pre-sorted half into the original array.
Here are some ASCII graphics illustrating the process:
________________________________________
...| 1 | 3 | 5 | 7 | 9 | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
start-^ mid-^ end-^
___________________
| | | | | |...
-------------------
0-^
________________________________________
...| 1 | 3 | 5 | 7 | 9 | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
start-^ mid-^ end-^
___________________
| 1 | 3 | 5 | 7 | 9 |...
-------------------
________________________________________
...| | | | | | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
___________________
| 1 | 3 | 5 | 7 | 9 |...
-------------------
i1-^
________________________________________
...| | | | | | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
curr-^ i2-^
___________________
| 1 | 3 | 5 | 7 | 9 |...
-------------------
i1-^
________________________________________
...| 1 | | | | | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
curr-^ i2-^
___________________
| 1 | 3 | 5 | 7 | 9 |...
-------------------
i1-^
________________________________________
...| 1 | 2 | | | | 2 | 4 | 6 | 8 | 10 |...
----------------------------------------
curr-^ i2-^
___________________
| 1 | 3 | 5 | 7 | 9 |...
-------------------
________________________________________
...| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |...
----------------------------------------
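The process illustrated above can be sketched in Java as follows. This is a minimal sketch, not the starter code's required implementation: the class, method name, and inclusive-bound convention are assumptions, and the starter code may use generics rather than int arrays.

```java
// Hypothetical sketch of the half-space merge. Assumes items[start..mid] and
// items[mid+1..end] (inclusive bounds) are each already sorted.
public final class HalfSpaceMergeDemo {

    static void mergeHalf(int[] items, int start, int mid, int end) {
        // Stage 1: copy only the first half into a temporary array of
        // size roughly n/2.
        int[] temp = new int[mid - start + 1];
        System.arraycopy(items, start, temp, 0, temp.length);

        int i1 = 0;        // next unmerged item in temp (the first half)
        int i2 = mid + 1;  // next unmerged item in the second half (in place)
        int curr = start;  // next position to fill in the original array

        // Stage 2: merge temp and the second half back into items.
        while (i1 < temp.length && i2 <= end) {
            if (temp[i1] <= items[i2]) {   // <= keeps the merge stable
                items[curr++] = temp[i1++];
            } else {
                items[curr++] = items[i2++];
            }
        }
        // Copy any leftovers from temp. Leftovers from the second half are
        // already in their final positions, so no work is needed for them.
        while (i1 < temp.length) {
            items[curr++] = temp[i1++];
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7, 9, 2, 4, 6, 8, 10};
        mergeHalf(a, 0, 4, 9);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Note that `curr` can never overtake `i2`: by the time `curr` reaches a position in the second half, the value originally there has already been merged out, which is why the second half can be merged in place.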
Make sure to test your modified sorting algorithm to confirm that it actually sorts correctly! Once you are confident that it is working, use the sort profiling tool to evaluate your finished implementation.
Part 2 - Switching Strategies
The next improvement is based on the observation that merge sort is actually slower than simple \(\Theta(n^2)\) sorts for small input sizes. This may seem surprising given that merge sort is an \(\Theta(n \log n)\) sorting algorithm. However, it is important to keep in mind that asymptotic analysis is only concerned with rates of growth. A \(\Theta(n \log n)\) algorithm will always be faster than a \(\Theta(n^2)\) algorithm eventually, but that doesn't mean the \(\Theta(n^2)\) algorithm can't be faster for small inputs. The following figure was created by timing merge sort and insertion sort on small randomly ordered arrays from size 2 to size 150:
As you can see, insertion sort is faster until around \(n = 90\). At that point, merge sort becomes faster and it remains faster for all larger inputs.
As a reminder, the following pseudocode describes the overall logic of the merge sort algorithm:
merge_sort(sub-array)
    If sub-array has more than one entry:
        Recursively merge_sort the left half
        Recursively merge_sort the right half
        Merge the two sorted halves.
This logic recursively splits the original array into smaller and smaller sub-arrays until the recursion bottoms out at sub-arrays of size one. This means that every time a large array is sorted, there are many recursive calls to merge sort that have small input sizes. In light of the figure above, that approach doesn't make much sense: merge sort is not a competitive sorting algorithm on small inputs. It would make more sense to recursively break the input into smaller and smaller pieces until some threshold is reached, and then switch strategies to a sorting algorithm that is more efficient on those small inputs.
The following pseudocode describes this alternate approach:
merge_sort(sub-array)
    If sub-array has fewer than MERGE_SORT_THRESHOLD entries:
        Sort the sub-array with insertion sort.
    Otherwise:
        Recursively merge_sort the left half
        Recursively merge_sort the right half
        Merge the two sorted halves.
Note that we have provided an implementation of insertion sort, insertionSubsort, that works on a sub-array for this part of the assignment.
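The switching strategy above can be sketched as follows. This is an illustrative sketch only: the class name, the int-array element type, the inclusive-bound convention, and the starting threshold value of 16 are all assumptions, and the starter code's insertionSubsort and merge may have different signatures.

```java
// Hypothetical sketch of merge sort that switches to insertion sort on
// small sub-arrays. All bounds are inclusive.
public final class AdaptiveMergeSortDemo {
    // Placeholder value; the assignment asks you to tune this experimentally.
    static int MERGE_SORT_THRESHOLD = 16;

    static void mergeSort(int[] items, int start, int end) {
        if (end - start + 1 < MERGE_SORT_THRESHOLD) {
            insertionSubsort(items, start, end); // small input: switch strategies
            return;
        }
        int mid = start + (end - start) / 2;
        mergeSort(items, start, mid);
        mergeSort(items, mid + 1, end);
        merge(items, start, mid, end);
    }

    // Insertion sort restricted to items[start..end].
    static void insertionSubsort(int[] items, int start, int end) {
        for (int i = start + 1; i <= end; i++) {
            int key = items[i];
            int j = i - 1;
            while (j >= start && items[j] > key) {
                items[j + 1] = items[j]; // shift larger items right
                j--;
            }
            items[j + 1] = key;
        }
    }

    // Standard textbook merge using a full-size temporary array.
    static void merge(int[] items, int start, int mid, int end) {
        int[] temp = java.util.Arrays.copyOfRange(items, start, end + 1);
        int i1 = 0, i2 = mid - start + 1, curr = start;
        while (i1 <= mid - start && i2 <= end - start) {
            items[curr++] = (temp[i1] <= temp[i2]) ? temp[i1++] : temp[i2++];
        }
        while (i1 <= mid - start) items[curr++] = temp[i1++];
        while (i2 <= end - start) items[curr++] = temp[i2++];
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 7, 3, 8, 6, 4, 0};
        mergeSort(a, 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

In your actual solution, the base-case call should of course use the provided insertionSubsort and your half-space merge from Part 1 rather than the stand-ins shown here.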
Choosing an appropriate value for MERGE_SORT_THRESHOLD requires some experimentation. One of the requirements of this assignment is that you select an appropriate threshold value and provide data to justify your choice. Simply selecting the crossover point based on the figure above is not a reasonable strategy. The point where the two lines cross represents the input size where the two sorts are equally fast. Switching strategies at that input size will result in no speedup at all.
The only way to determine an appropriate choice for MERGE_SORT_THRESHOLD is to systematically experiment with different values until you find the one that leads to the best overall sorting performance. As a starting point, you can experiment with the following driver method that iterates over different values of MERGE_SORT_THRESHOLD and uses SortProfiler to evaluate each.
import java.util.List;

public class ProfileDriver {
    public static void main(String[] args) {
        // Configuration parameters
        int startSize = 20000;
        int sizeInterval = 20000;
        int maxSize = 100000;
        int numTrials = 20;

        // Print CSV header
        System.out.print("Threshold");
        for (int size = startSize; size <= maxSize; size += sizeInterval) {
            System.out.print(",Size_" + size);
        }
        System.out.println();

        // Test different threshold values
        for (int threshold = 5; threshold <= 150; threshold += 5) {
            MergeSortImproved.MERGE_SORT_THRESHOLD = threshold;

            // Create a SortProfiler object
            SortProfiler sp = new SortProfiler(List.of("merge_adaptive"), "random", startSize,
                    sizeInterval, maxSize, numTrials);

            // Run the profiler and get timing data
            double[][] timingData = sp.run();

            // Print CSV row: threshold value followed by timing data for each size
            System.out.print(threshold);
            for (int sizeIndex = 0; sizeIndex < timingData[0].length; sizeIndex++) {
                System.out.printf(",%.8f", timingData[0][sizeIndex]);
            }
            System.out.println();
        }
    }
}
Part 3 - Introspective Sort
As mentioned above, quicksort is usually very fast. There are relatively few pathological inputs that result in \(\Theta(n^2)\) performance. The idea behind introspective sort is to perform a standard quicksort until there is evidence that the current input is pathological. If it looks like the current input will cause \(\Theta(n^2)\) performance, the algorithm switches strategies to a sorting algorithm with guaranteed \(O(n \log n)\) performance. Typically, heap sort is used as the alternative, since it is in-place and takes \(\Theta(n \log n)\) time in the worst case. Since we haven't yet studied heap sort, your version of introspective sort should use the improved merge sort from Part 2 as the alternate sorting algorithm. The resulting algorithm is almost as fast as quicksort for most inputs, but has a worst-case performance of \(\Theta(n \log n)\).
It is possible to detect pathological inputs by tracking the recursion depth of the quicksort algorithm. When quicksort is working well, the partition operation typically moves the pivot item somewhere near the center of the current sub-sequence. When this is the case, the maximum recursion depth will be \(\Theta(\log n)\). Therefore, introspective sort switches strategies when the recursion depth exceeds \(2 \lfloor \log_2 n\rfloor\).
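The depth limit \(2 \lfloor \log_2 n\rfloor\) can be computed without floating-point math by finding the position of the highest set bit of \(n\). A small sketch (the class and method names are hypothetical):

```java
// Sketch: computing the introsort depth limit 2 * floor(log2(n)) for an
// array of length n >= 1, using Integer.numberOfLeadingZeros instead of
// Math.log to avoid floating-point rounding issues.
public final class DepthLimitDemo {
    static int depthLimit(int n) {
        // For n >= 1, the highest set bit of n is at position floor(log2(n)).
        int floorLog2 = 31 - Integer.numberOfLeadingZeros(n);
        return 2 * floorLog2;
    }

    public static void main(String[] args) {
        System.out.println(depthLimit(16));   // floor(log2(16)) = 4, so 8
        System.out.println(depthLimit(1000)); // floor(log2(1000)) = 9, so 18
    }
}
```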
There are a number of possible variations on the basic Introsort algorithm. Your implementation should conform to the following pseudocode:
introspective_sort(array)
    introspective_sort(array, 2 ⌊log₂(array.length)⌋)

introspective_sort(sub-array, depth-limit)
    If depth-limit is 0
        merge_sort(sub-array)
    Otherwise
        If sub-array has 0 or 1 entries
            return
        Otherwise
            pivot ← partition(sub-array)
            introspective_sort(sub-array[0 to pivot - 1], depth-limit - 1)
            introspective_sort(sub-array[pivot + 1 to end], depth-limit - 1)
Note that the provided version of merge sort sorts a full array. Introspective sort needs to be able to call a version of merge sort that can sort a portion of a provided array. The MergeSortImproved class provides a declaration for such a method.
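The control flow of the pseudocode might be sketched as follows. This is a minimal sketch under several assumptions: the class and method names are illustrative, a Lomuto partition stands in for whatever partition scheme the starter code uses, and java.util.Arrays.sort stands in for the Part 2 merge sort fallback that your real implementation must call.

```java
// Hypothetical sketch of introspective sort. All bounds are inclusive.
public final class IntrosortDemo {
    static void introspectiveSort(int[] items) {
        // Depth limit of 2 * floor(log2(n)); guard against n == 0.
        int depthLimit = 2 * (31 - Integer.numberOfLeadingZeros(Math.max(1, items.length)));
        introspectiveSort(items, 0, items.length - 1, depthLimit);
    }

    static void introspectiveSort(int[] items, int start, int end, int depthLimit) {
        if (start >= end) {
            return; // 0 or 1 entries: already sorted
        }
        if (depthLimit == 0) {
            // Recursion too deep: the input looks pathological, so fall back
            // to a guaranteed n log n sort. (Stand-in for the Part 2 merge
            // sort that works on a sub-array.)
            java.util.Arrays.sort(items, start, end + 1);
            return;
        }
        int pivot = partition(items, start, end);
        introspectiveSort(items, start, pivot - 1, depthLimit - 1);
        introspectiveSort(items, pivot + 1, end, depthLimit - 1);
    }

    // Lomuto partition using the last element as the pivot; returns the
    // pivot's final index. Illustrative only.
    static int partition(int[] items, int start, int end) {
        int pivot = items[end];
        int i = start - 1;
        for (int j = start; j < end; j++) {
            if (items[j] <= pivot) {
                i++;
                int tmp = items[i]; items[i] = items[j]; items[j] = tmp;
            }
        }
        int tmp = items[i + 1]; items[i + 1] = items[end]; items[end] = tmp;
        return i + 1;
    }

    public static void main(String[] args) {
        int[] a = {9, 3, 7, 1, 8, 2, 6, 0, 5, 4};
        introspectiveSort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```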
Part 4 - Experimental Analysis
So far in this course we have focused on asymptotic analysis. This usually gives us the information we want. When deciding between two algorithms we generally prefer the one with better asymptotic performance.
Our goals for this project are somewhat different. The focus now is on fine-tuning existing algorithms to improve their performance. This is a situation where we really care about the constant factors. (We would be very happy if we could write a version of merge sort that is twice as fast, even if the asymptotic worst-case performance were still \(\Theta(n \log n)\).)
In order to help you evaluate your sorting improvements we are providing a command-line tool that can be used to systematically evaluate the run-time performance of different sorting algorithms.
Running the tool with no arguments should produce the following help message:
Option (* = required) Description
--------------------- -----------
* -s <Integer: NUMBER> Starting (smallest) input size
* -i <Integer: NUMBER> Input size increment
* -m <Integer: NUMBER> Maximum input size to test
* -t <Integer: NUMBER> Number of trials for each input size
-w [String: SORT1,SORT2,...] Comma separated list of sorts. Options include
insertion, selection, merge, merge_half,
merge_adaptive, quick, introspective and
timsort. Default is to execute all sorts.
  -g [String: GENERATOR]       Sequence generator. Options include random,
                               ordered, or evil. The default is random.
For example, we could compare the performance of quicksort and introspective sort on small inputs by executing the tool as follows:
$ java SortProfiler -s 1000 -i 1000 -m 10000 -t 10 -w quick,introspective
The resulting output would look something like the following:
N, quick, introspective
1000, 0.00009277, 0.00000085
2000, 0.00021152, 0.00000086
3000, 0.00030850, 0.00000088
4000, 0.00043710, 0.00000090
5000, 0.00055941, 0.00000088
6000, 0.00068144, 0.00000087
7000, 0.00081457, 0.00000091
8000, 0.00095075, 0.00000087
9000, 0.00105827, 0.00000089
10000, 0.00117701, 0.00000087
(Introspective sort is very fast here because it hasn't been implemented yet.)
These results are in a comma-separated value (CSV) format, which is essentially a table of values. The first row is the header row, and the subsequent rows are the data. The first column gives N, followed by the running time, in seconds, for each sorting algorithm at that input size.
CSV data can be easily imported into a spreadsheet program like Microsoft Excel, Google Sheets, or even Desmos. You can then use the spreadsheet program to create graphs that compare the performance of the different sorting algorithms.
For example, you can copy the CSV data (including headers), paste it into Google Sheets, and then click "Split data into columns" to turn it into a table. In Excel, you can use the "Paste > Use Text Import Wizard" to do the same thing.
Afterwards, you will need to plot your results into an interpretable chart/graph. You should be able to figure out how to do this for your respective spreadsheet program.
You will use your CSV output to prepare a short document that quantifies the performance improvements you were able to achieve. This document should take the form of the following figures:
- Figure 1 - Improved Merges. This figure should compare the performance of mergeSortHalfSpace to the original merge sort implementation as well as quicksort. Results should be shown for both random and ordered inputs. To avoid clutter, you should plot the results for random inputs and ordered inputs separately.
- Figure 2 - Threshold Selection. This figure should show the data you used to select an appropriate threshold for switching strategies in mergeSortAdaptive.
- Figure 3 - Switching Strategies. This figure should compare your tuned mergeSortAdaptive to the previous two merge sort implementations as well as quicksort. Results should be shown for both random and ordered inputs.
- Figure 4 - Introspective Sort (Random Inputs). This figure should compare introspective sort to quicksort for randomly generated inputs.
- Figure 5 - Introspective Sort (Pathological Inputs). This figure should compare introspective sort to quicksort for worst-case quicksort inputs.
Each figure must have clearly labeled axes, a caption, and appropriate legends. There must be a paragraph describing the results illustrated in each figure. The description must provide enough information for the reader to understand the point of the experiment and how the data supports the conclusion.
Keep in mind that we are interested in the running times for these sorting algorithms as the input size gets big. "Big" is a relative term, but you should be showing results up to, at least, N=50,000.
Here are some issues to keep in mind as you gather and prepare your experimental results:
- You should adjust the Y-axis range to easily see the entirety of the data.
- If you are comparing two algorithms, they should be plotted on the same graph (otherwise, it is difficult to compare them).
- Make sure to use different colors or symbols to differentiate between the two algorithms.
- You should also include a legend to indicate which color/symbol corresponds to which algorithm.
- You may need to experiment with the -t flag in order to obtain clean and consistent experimental results. The time it takes for each individual sort can be influenced by a number of unpredictable factors, such as the particular sequence of items being sorted or other processes running on the computer. By increasing the number of trials, you can average over many sorts to reduce noise and produce a more stable performance estimate.
- If you observe unusual or inconsistent timing results on Windows, try running your profiling on the lab computers instead. Some versions of Microsoft Windows do not provide the high-resolution timer accuracy required for this project.
Submission
Submit the following three files through Gradescope:
- MergeSortImproved.java
- IntrospectiveSort.java
- SortAnalysis.pdf - The document containing your analysis
Test your code locally at each stage of development, and reason about its correctness. When you are reasonably confident in your solution, submit your code to Gradescope to run the autograder's unit tests.
Grading
Gradescope Tests for Improved Merge | 2 points |
Gradescope Tests for Switching Strategies | 2 points |
Gradescope Tests for Introspective Sort | 2 points |
Experimental Results Document | 3 points |
Gradescope Style Checks | 1 point |
The grading of your evaluation document will be based on the quality of the figures as well as the clarity of your writing.
*The second approach has recently become popular. Both Python (starting in version 2.3) and Java (starting in version 7) use Timsort as their default sorting algorithm. Timsort is a modified version of merge sort that includes several enhancements: it uses a more space-efficient merge operation, it takes advantage of partially sorted arrays by finding and merging existing sorted regions, and it uses a non-recursive binary insertion sort to handle short unsorted regions.