Applications
The goal of this activity is to apply all of the techniques and structures we've studied this semester to solve particular problems. You should read the problem descriptions, determine appropriate data structures, implement efficient solutions in psuedocode, and analyze your solutions. You should work in groups on this activity.
IMPORTANT: Do not "re-invent the wheel!" None of these problems require extensive custom data structures; they can all be implemented very simply by leveraging the structures we've implemented or discussed in class.
As a reminder, here are the major data structure implementation categories we've discussed this semester:
- Contiguous
- (Dynamic) Arrays
- Heaps
- Hash tables
- Linked
- Linked lists
- Skip lists
- (Binary) Trees
Here is a list of ADTs that we have implemented using the techniques above:
- List
- Stack
- Queue
- Priority Queue
- Set
- Sorted Set
- Map
- Sorted Map
To solve the tasks in this activity, we recommend focusing your attention on the following data structures:
- Dynamic Array List
- Linked List Queue
- Linked List Stack
- Hash Set/Map
- Skip List Sorted Set/Map
For the purposes of this activity, you may ignore memory leaks, although you should be aware that if you were to actually code up any of these structures in C, you would need to pay attention to memory allocation issues.
Check-in Manager
You have been selected to build a simple mobile check-in app for next year's Harrisonburg Holiday Parade. Each entrant in the parade has been assigned an number in the order in which they will appear in the parade, starting with zero (0). Your app needs to provide the capability for entrants to check in once they are ready, and provide parade organizers the ability to see who has checked in.
Here is the interface you must implement:
init(int total_entrants)
Initialize new check-in manager with the given total number of entrants.
check_in(int number)
Add a new checked-in entrant.
is_checked_in(int number)
Returns true if the given entrant has checked in already; false otherwise.
all_ready()
Returns true if all entrants have checked in; false otherwise.
All operations should be efficient, but the initialization only happens once so it's ok if it's more expensive than the other operations.
Duplicate Visitor Tracker
You are in charge of a cool new website that helps JMU students to organize student events and to prevent club meeting clashes. It has only begun to roll out, however, and you know that many students who visit the website once never return. You'd like to start looking for patterns among the students that do return for a second time.
Thus, you've decided to create a system that will keep track of repeat visitors for further analysis. Further, in order to facilitate batch eID lookups, you need to find repeat visitors in sorted order. Because this code will run live on the website server, it needs to be very efficient.
Here is the interface you must implement:
init()
Initialize new duplicate visitor tracker.
add_id(id)
Add a new eID to the tracker.
get_duplicates()
Return a sorted list of all duplicate eIDs seen so far.
Both the add_id
and the get_duplicates
functions may be called arbitrarily many times and in any order: efficiency matters for both.
Example:
init()
add_id("smithja")
add_id("doejb")
add_id("lam2mo")
add_id("doejb")
add_id("smithja")
add_id("smithja")
print(get_duplicates()) // should be ["doejb", "smithja"]
add_id("lam2mo")
print(get_duplicates()) // should be ["doejb", "lam2mo", "smithja"]
Most Frequent Visitor Tracker
Continuing from the website scenario in the previous section, another thing you'd like to know at any given point is the identity of your website's most devoted user (i.e., the user who has visited it the most).
Here is the interface that you must implement:
init()
Initialize a new frequent visitor tracker.
add_id(id)
Add a new eID to the tracker.
get_most_frequent()
Return the most frequently-seen eID so far.
Both the add_id
and the get_most_frequent
functions may be called arbitrarily many times and in any order: efficiency matters for both.
Example:
init()
add_id("smithja")
add_id("smithja")
add_id("doejb")
print(get_most_frequent()) // should be "smithja"
add_id("doejb")
add_id("doejb")
print(get_most_frequent()) // should be "doejb"
Job Sequencer
You are in charge of implementing a scheduler for a large computational cluster that is used by many researchers from several different universities and research labs. The problem is that these jobs are not all similar; some are CPU-bound, some are memory-bound, some are low-priority, some are urgent, some must be run together, etc.
To deal with this situation, you have categorized each of these needs into a job "type" and you require every person who submits a job to tag it with a specific job type. Jobs with the same type should be handled in first come, first served order. You have also written monitoring software that runs in the background and determines when it is appropriate to run a new job of a given type (assuming one is available).
Now, you must write the system that actually performs the bookkeeping in order to implement scheduling. The cluster handles thousands of jobs per day and is somehow always backlogged, so the scheduling process must be very efficient. Further, you may in the future add or remove job types, so your solution should not assume a fixed number of types.
Here is the interface that you must implement:
init()
Initialize a new job sequencer.
add_job(type, id)
Add a new job with the given type and ID to the scheduling system.
next_job(type)
Return the next job to be processed for the given job type.
You should assume that both the add_job
and the next_job
functions may be called arbitrarily many times and in any order: efficiency matters for both. If no job is available, the next_job
method should return NULL.
Example:
init()
add_job("CPUBound", 101)
add_job("MemBound", 105)
add_job("CPUBound", 93)
add_job("MemBound", 202)
print(next_job("CPUBound")) // should return 101
print(next_job("MemBound")) // should return 105
print(next_job("MemBound")) // should return 202
print(next_job("CPUBound")) // should return 93
print(next_job("CPUBound")) // should return NULL