CS 240: Algorithms and Data Structures
James Madison University, Fall 2024

Hybrid Deque

Introduction

The Double Ended Queue, or Deque (pronounced “deck”), is an abstract data type that allows elements to be appended and removed from either end of a sequence. The Java collections framework includes a Deque Interface along with several implementing classes. These include the ArrayDeque, (implemented using a circular dynamic array), and LinkedList (implemented using a doubly-linked list).

Note that the Deque is not a widely used abstract data type. In most cases it makes more sense to use either a Stack or a Queue, depending on the application. The most commonly cited example of an application that does require a Deque is the work stealing algorithm.

The goal of this assignment is to develop a new Deque implementation that satisfies the following design requirements:

Neither ArrayDeque nor LinkedList satisfy both of these requirements. While the ArrayDeque provides amortized \(\Theta(1)\) append and remove operations, the need to resize the underlying array means that appending will occasionally require \(\Theta(n)\) steps. The LinkedList does provide \(\Theta(1)\) append and remove in the worst case, but it is quite inefficient in terms of space. The need to store backward and forward references for each item stored means that as much as 2/3 of the required memory is “wasted” overhead.

Hybrid Deque Data Structure

For this project we will develop a HybridDeque class that combines some of the advantages of contiguous and linked data structures. Our implementation is inspired by the Python deque collection. The following description is taken from the internal documentation for that class:

Data for deque objects is stored in a doubly-linked list of fixed length blocks. This assures that appends or pops never move any other data elements besides the one being appended or popped.

Textbook implementations of doubly-linked lists store one datum per link, but that gives them a 200% memory overhead (a prev and next link for each datum) and it costs one malloc() call per data element. By using fixed-length blocks, the link to data ratio is significantly improved and there are proportionally fewer calls to malloc() and free(). The data blocks of consecutive pointers also improve cache locality.*

Essentially, our Deque implementation will be a doubly-linked list where each list node (or “block”) contains an array of data elements:

Starter Code

The following starter code is provided:

Cursor Class

One of the challenges in implementing the HybridDeque class is that it requires two pieces of information to specify a location in the Deque: a block reference and an index into that block. The implementation may be simplified by creating a private inner class that encapsulates both pieces of information and handles the logic of stepping forward and backward. Feel free to make use of the provided inner Cursor class if you would like to take this approach.

Drawing Pictures

This assignment will be impossible if you don’t develop a good mental model of how the data structure is organized. The readiness quiz will require that you take some time to draw the memory layout for some sample HybridDeque objects before you start coding. You will be asked to draw a HybridDeque at each of the indicated points in the sample code below.

HybridDeque.setBlockSize(8);
HybridDeque<String> deque = new HybridDeque<>();

// Draw the deque now... (This one is done for you.)

deque.offerLast("A");

// Draw the deque now...

deque.offerLast("B");
deque.offerLast("C");
deque.offerLast("D");
deque.offerLast("E");

// Draw the deque now...

deque.pollFirst();
deque.pollFirst();
deque.pollFirst();
deque.pollFirst();

// Draw the deque now...

Here is an example of what the correct memory diagram would look like for the initial step above:

Part 1: O(1) Methods

Part 1 of the project involves implementing all HybridDeque methods that do not involve removing elements from the middle of the collection.

Part 2: Mid-Deque Removal

While the Deque interface is primarily designed to allow access to elements at the ends of the sequence, there are a few methods that allow an item to be removed from the middle. These are removeLastOccurance, removeFirstOccurance and the remove methods of the two iterators. These methods are relatively challenging to implement because removing an element from middle of the sequence requires shifting all subsequent elements left to fill the vacated space.

Some issues to keep in mind while implementing removal:

Implementation Advice

Smaller Block Sizes

The default block size in the provided code is 64. This follows the original Python implementation and represents a trade-off: A large block size will waste a significant amount of space for small collections that don’t fill an entire block. A small block size will increase the overhead required for forward and backward references when storing large collections.

While 64 represents a good block size in terms of space efficiency, it probably isn’t ideal for tracing and debugging your code during the implementation stage. Bugs are likely to pop up at block boundaries, and it is inconvenient if you need to step through dozens of operations before reaching a boundary. I suggest changing the block size to 4 or 8 while you work on your implementation. You should switch back to 64 when you submit.

Taking Advantage of AbstractCollection

AbstractDeque extends AbstractCollection, which includes default implementations for several methods in the Collection interface. We’ll be able to take advantage of some of these default implementations. For example, we don’t need to write a toString method, since the toString method defined in AbstractCollection will work perfectly well. Here is the description from the docstring:

Returns a string representation of this collection. The string representation consists of a list of the collection’s elements in the order they are returned by its iterator, enclosed in square brackets (“[]”). Adjacent elements are separated by the characters “,” (comma and space). Elements are converted to strings as by String.valueOf(Object).

This is exactly the behavior we want, so there is no point in implementing our own version of toString.

On the other hand, some of the default methods would work, but would be disastrous from the point of view of efficiency. For example, the clear method provided by AbstractCollection operates as follows:

This implementation iterates over this collection, removing each element using the Iterator.remove operation. Most implementations will probably choose to override this method for efficiency.

As the comment suggests, this would be a silly and inefficient way to clear the items from our HybridDeque.

You are free to take advantage of methods defined in AbstractCollection as long as they don’t negatively impact the efficiency of your implementation.

Existing Implementation

As we mentioned above, the design of the HybridDeque class is based on the C implementation of Python’s deque collection. You are welcome to examine to the existing C implementation as you work on your code. It probably isn’t a good idea to attempt a line-for-line translation from C to Java, but the C code might provide some helpful ideas.

Submission and Grading

The project will be graded as follows:

Gradescope Readiness Quiz 10%
Submission Tests Part 1 55%
Submission Tests Part 2 10%
Student JUnit Tests 10%
Gradescope Style Checks 5%
Instructor Style Points 10%

No late submissions will be accepted for the readiness quiz.

Note that we will not be providing copies of our Gradescope submission tests for this assignment. In order to get full credit for the testing portion of the grade, you must submit JUnit 5 tests that provide full line coverage of the HybridDeque class.

You will be limited to 10 free Gradescope submissions for this assignment. Your project grade will be reduced by 1% for each submission beyond 10.

Any submission that doesn’t pass all of the instructor unit tests will receive at most 75% of the points for that component of the grade.

The Gradescope submission tests will include checks for efficiency as well as correctness. In other words, if you code a \(\Theta(1)\) operation in such a way that it requires \(\Theta(n)\) time to execute, your implementation will be considered incorrect even if it produces the expected result.


* https://github.com/python/cpython/blob/v3.11.2/Modules/_collectionsmodule.c
https://docs.python.org/3.11/license.html