Linear Probing

Impact of Load Factor on Cost

Dashed lines are linear probing, solid lines are “random” probing.
Dashed lines are linear probing, solid lines are “random” probing.

Load factor is on the x-axis, expected number of buckets accessed on the y-axis.

(From OpenDSA Data Structures and Algorithms book)

Solution #1: Quadratic Probing

Solution #2: Double Hashing

Problems/Challenges with Hashing

What Gets used In The Wild?

C#/.net Dictionary

Python 2

Implementation in: Python-2.7.2/Objects/dictobject.c. Source can be downloaded from www.python.org.

Python 3.6+

Update in Python 3:

the dictionary:

    d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

is currently stored as:

    entries = [['--', '--', '--'],
               [-8522787127447073495, 'barry', 'green'],
               ['--', '--', '--'],
               ['--', '--', '--'],
               ['--', '--', '--'],
               [-9092791511155847987, 'timmy', 'red'],
               ['--', '--', '--'],
               [-6480567542315338377, 'guido', 'blue']]

Instead, the data should be organized as follows:

    indices =  [None, 1, None, None, None, 0, None, 2]
    entries =  [[-9092791511155847987, 'timmy', 'red'],
                [-8522787127447073495, 'barry', 'green'],
                [-6480567542315338377, 'guido', 'blue']]

Only the data layout needs to change.  The hash table
algorithms would stay the same.

descripion: https://mail.python.org/pipermail/python-dev/2012-December/123028.html

source: https://github.com/python/cpython/blob/main/Objects/dictobject.c

Ruby

Original source: https://github.com/ruby/ruby/blob/ruby_2_3/st.c Updated source: https://github.com/ruby/ruby/blob/ruby_3_0/st.c

Java 7 (We’ve seen…)

source (GPL V2.0)

Java 8+

Java 16 Source: https://github.com/openjdk/jdk/blob/jdk-16+36/src/java.base/share/classes/java/util/HashMap.java