(This article is also available in Russian)
In article of Arash Partow "General Purpose Hash Function Algorithms" several 32-bit algorithms are reviewed:
Another five variants:
Here are the C sources.
To measure the efficiency of hash functions I prepared the following test data:
Total volume after merging is 326797 different words.
For every word a 32-bit hash value was computed, and counted a number of collisions.
Algorithm | Collisions |
---|---|
rs | 9 |
js | 98 |
pjw | 1315 |
bkdr | 11 |
sdbm | 14 |
djb | 83 |
dek | 308 |
ap | 16 |
ly | 9 |
rot13 | 7 |
faq6 | 14 |
lookup3 | 9 |
crc32 | 13 |
A list of collisions for rs, bkdr, sdbm, ap, ly, rot13, faq6, lookup3 and crc32 is available here.
Algorithms with minimal collisions:
Results are presented on picture. Two outsiders — pjw and dek — exceed the limits of Y axis.
In previous test, all data had MSB unchanged. For the second test another data set was selected:
Total volume after merging is 310595 different words.
Algorithms js, pjw, djb и dek were excluded from testing.
Algorithm | Collisions |
---|---|
rs | 5 |
bkdr | 14 |
sdbm | 14 |
ap | 32 |
ly | 8 |
rot13 | 10 |
faq6 | 13 |
lookup3 | 14 |
crc32 | 15 |
Algorithms with minimal collisions:
Results are presented on picture.