Hashmaps
An Introduction |
Prof. David Bernstein
|
Computer Science Department |
bernstdh@jmu.edu |
int
indexint
and aren't
sequentialint
Values: | Homogeneous elements of any type | |||||||||||||||||||||
Operations: |
|
In addition, one could add an isEmpty()
operation, an
isFull()
operation (if there is a size limit), a
remove()
operation, and a
makeEmpty()
operation.
Perfect | Each key is converted into a unique integer. |
Imperfect | Multiple keys are converted into the same integer. |
Minimal Perfect | In addition to being perfect, the set of \(n\) keys is converted into the set of integers \(\{ 0, 1, 2, ... n-1 \}\) |
An Example: We need to "index" a collection of books written by Sue Grafton because we are developing an "instant printing" service that charges by the page
Title | Pages |
A Is for Alibi | 214 |
B Is for Burglar | 211 |
C Is for Corpse | 212 |
D Is for Deadbeat | 239 |
E Is for Evidence | 199 |
F Is for Fugitive | 308 |
G Is for Gumshoe | 341 |
H Is for Homicide | 287 |
I Is for Innocent | 343 |
J Is for Judgment | 376 |
K Is for Killer | 307 |
L Is for Lawless | 322 |
M Is for Malice | 337 |
N Is for Noose | 322 |
O Is for Outlaw | 318 |
We would like a hash function that converts each of these titles into a unique integer in the closed interval [0, 15]. One such function is:
Cichelli, R.J. (1980) "Minimal Perfect Hash Functions Made Simple", Communications of the ACM, Vol. 23, pp. 17-19.
Czech, Z.J. and Majewski, B.S. (1993) "A Linear Time Algorithm for Finding Minimal Perfect Hash Functions", Computer Journal, Vol. 36, pp. 579-587.
Fox, E.A., Heath, L.S., Lenwood, S., Chen, Q.F. and Daoud, A.M. (1992) Practical Minimal Perfect Hash Functions for Large Databases, Communications of the ACM, Vol. 35, pp. 105-121.
Haggard, G. and Karplus, K. (1986) "Finding Minimal Perfect Hash Functions", SIGCSE Bulletin, Vol. 18, pp. 191-193.
An Example: We have a collection of holidays and we want to "index them" by date (where each date is in the form MMDD). We could use MM*100 + DD but...
The Date Example (cont.)
This is an imperfect hash function because more than one key can generate the same integer. For example, March 10 hashes to 29. and October 3 also hashes to 29.
where &
is the bitwise logical AND operator,
^
is the bitwise XOR opertor, ~
is the bitwise NOT operator, >>
is the
shift right operator, and <<
is the
shift left operator.
%
operator) until an empty
slot is found\(A_2\) | \(A_3\) | \(A_5\) |
\(A_2\) | \(A_3\) | \(A_5\) | \(B_5\) |
\(A_2\) | \(A_3\) | \(B_2\) | \(A_5\) | \(B_5\) |
\(A_2\) | \(A_3\) | \(B_2\) | \(A_5\) | \(B_5\) | \(C_2\) |
Linear Probing | Double Hashing | |||
\(a\) | Successful | Unsuccessful | Successful | Unsuccessful |
0.10 | 1.1 | 1.1 | 1.1 | 1.1 |
0.20 | 1.1 | 1.3 | 1.1 | 1.2 |
0.30 | 1.2 | 1.5 | 1.2 | 1.4 |
0.40 | 1.3 | 1.9 | 1.3 | 1.7 |
0.50 | 1.5 | 2.5 | 1.4 | 2.0 |
0.60 | 1.8 | 3.6 | 1.5 | 2.5 |
0.70 | 2.2 | 6.1 | 1.7 | 3.3 |
0.80 | 3.0 | 13.0 | 2.0 | 5.0 |
0.90 | 5.5 | 50.5 | 2.6 | 10.0 |
Number | Closest Prime |
100 | 97 |
250 | 241 |
500 | 499 |
1000 | 997 |
1500 | 1499 |
2000 | 1999 |
5000 | 4999 |
7500 | 7499 |
10000 | 9973 |