On Thu, Nov 15, 2018 at 09:27:42AM -0500, Devin Hussey wrote: > >From 502cbd61e386eb014035df66b032e90a9de926b7 Mon Sep 17 00:00:00 2001 > From: Devin Hussey <husseydevin@xxxxxxxxx> > Date: Thu, 15 Nov 2018 09:12:24 -0500 > Subject: [PATCH] Use a real non-cryptographic hash algorithm > dash previously used a modified version of the LoseLose algorithm. > LoseLose is a TERRIBLE algorithm. > It has terrible distribution, leaving many hash entries unused. > However, more importantly, it is incredibly easy to make a > collision; even something as simple as rearranging the letters. > For example. these all produce the hash 1652: > Hello > Holle > Hlleo > Hoell > Haxed\n > HASH\tBAD > Hater > Collsions in hash tables are very bad because it requires > looking through a linked list, and the benefits of O(1) time > in a hash table are completely lost. > The FNV-1a algorithm has comparable performance and code size > while having a much better collision rate. You do have a dependency chain with a multiplication per character, so it is less suitable for long strings. There are implementation techniques to avoid the dependency chain, but I'm not sure it's worth the trouble. > While there are some other faster and more resistant hashes out there, > namely xxHash, Murmur2, and CityHash, the benefits are minimal on > short strings and they expect a length argument, unlike FNV-1a which > simply uses null-termination, and they are not in the public domain > like FNV-1a. > [snip] > + hashval = (hashval ^ (unsigned char)*p++) * FNV_PRIME; > [snip] > + hashval = (hashval ^ (unsigned char)*p++) * FNV_PRIME; > [snip] > + hashval = (hashval ^ (unsigned char) *p++) + FNV_PRIME; I suppose this latter one should be * instead of + as well? -- Jilles Tjoelker