Mostly fun stuff
Definition: An improvement over Direct Access Tables (big arrays, e.g. storing all 10 digits phone numbers in an array). Use a hash function to convert a key to a smaller number and use the smaller number as index in a table.
Complexity will be O(1)
on average for all operations, but as the result of managing collisions, the hash table may have been filled in such a way that operations are O(n)
.
It should:
Chances of a collision with a large table: Very likely (birthday pparadox: with 23 persons, probability of two persons having the same birthday == 50%).
Separate Chaining
Each cell points to a linked list of records that have the same hash.
Pros:
Cons:
O(n)
.Open Adressing
To insert a new item we keep probing (fr: “Sonder”) until an empty slot is reached.
Different probing functions:
h[x]
full then try h[x] + 1
. Problem: Clustering, many consecutive elements form groups and search is expensive. Advantage: Good caching performance, easy to compute.h[x]
full then try h[x] + 1 * 1
and then h[x] + 2 * 2
.