User Tools

Site Tools


hash:efficiency-en

Hash Function Efficiency

(This article is also available in Russian)

In article of Arash Partow "General Purpose Hash Function Algorithms" several 32-bit algorithms are reviewed:

  • rs — simple hash function from Robert Sedgwicks book 'Algorithms in C'
  • js — bitwise hash function by Justin Sobel
  • pjw — algorithm based on work by Peter J. Weinberger
  • bkdr — hash function from Brian Kernighan and Dennis Ritchie's book 'The C Programming Language'
  • sdbm — algorithm of choice used in SDBM project
  • djb — algorithm produced by Professor Daniel J. Bernstein
  • dek — algorithm proposed by Donald E. Knuth in 'The Art Of Computer Programming'
  • ap — algorithm produced by Arash Partow

Another five variants:

Here are the C sources.

Test 1

To measure the efficiency of hash functions I prepared the following test data:

Total volume after merging is 326797 different words.

For every word a 32-bit hash value was computed, and counted a number of collisions.

Algorithm Collisions
rs 9
js 98
pjw 1315
bkdr 11
sdbm 14
djb 83
dek 308
ap 16
ly 9
rot13 7
faq6 14
lookup3 9
crc32 13

A list of collisions for rs, bkdr, sdbm, ap, ly, rot13, faq6, lookup3 and crc32 is available here.

Algorithms with minimal collisions:

  • rot13 — one circular shift (rotation) and addition
  • lookup3 — one shift and addition (or more)
  • ly — one multiplication and addition
  • rs — two multiplications

Results are presented on picture. Two outsiders — pjw and dek — exceed the limits of Y axis.


Test 2

In previous test, all data had MSB unchanged. For the second test another data set was selected:

Total volume after merging is 310595 different words.

Algorithms js, pjw, djb и dek were excluded from testing.

Algorithm Collisions
rs 5
bkdr 14
sdbm 14
ap 32
ly 8
rot13 10
faq6 13
lookup3 14
crc32 15

Algorithms with minimal collisions:

  • rs — two multiplications
  • ly — one multiplication and addition
  • rot13 — one circular shift (rotation) and addition

Results are presented on picture.

hash/efficiency-en.txt · Last modified: 2022/05/15 22:24 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki