Re: [PATCH 07/11] object: try naive cuckoo hashing

"George Spelvin" <linux@xxxxxxxxxxx> · 13 Aug 2011 15:12:06 -0400

I've been doing a lot of reading on Cuckoo hashing.

Yes, the single-table variant is described and used.  However, the
insertion procedure is not the way you do it.

Also, d-ary Cuckoo hashing (also called d-Cuckoo hashing) where you use
more than 2 hash functions is also used.

The insertion algorithm, however, is not really agreed on.

One algorithm (mostly proposed for hardware) uses d separate tables.
Every entry displaced from table i is displaced to table i+1 (mod d).
http://infoscience.epfl.ch/record/164147/files/cuckoo_dir_hpca2011_camera_ready.pdf

In the single-table case, and in general, however, a displaced entry
has more than one possible new location.  This leads to the question of how
to choose.

One proposal is to do a breadth-first search looking for a path to
a free slot.  It's provable that this will succeed with high probability
before the exponential growth of the breadth of the search tree
gets too bad.
See "Space Efficient Hash Tables with Worst Case Constant Access Time"
http://www.itu.dk/people/pagh/papers/d-cuckoo-jour.pdf

Another suggested tehcnique is to just pick an alternative at random
and proceed.  This is recommended in e.g.
"Efficient Hash Probes on Modern Processors"
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.67.1189&rep=rep1&type=pdf

Both of these lead to rather complex implementations.  The random number
generator is probably simpler than the breadth-first search, but either way
there's a bunch of auxiliary code.

Sticking with two hash functions, but using multi-entry buckets is
definitely an attractive possibility.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html