On Thu, May 27, 2010 at 11:05:54AM -0700, Shawn O. Pearce wrote: > Ilari Liusvaara <ilari.liusvaara@xxxxxxxxxxx> wrote: > > * Where algorithm in git_revpool_table__hash() is from? Since it appears to > > hash binary object IDs, wouldn't just simple sum/xor over words be sufficient > > (all SHA-1 output bits are very nearly independent). Or do you need to be > > compatible with some other implementation (doesn't appear so, because hash > > is computed differently depending on endianess)? > > If you need a hash value for a SHA-1, why not just cast the unsigned > char* to unsigned int* and load the first int as the hash code? > The output of SHA-1 is pretty evenly distributed, using the first > few bytes as an int should yield a sufficient distribution throughout > the hashtable. Yeah, With pseudorandom function[*], all ways of reducing the output to n bits are at most as good as just taking first n bits. But if reducing output to [0, m), the best way (distribution-wise, not speed-wise) is to take remainder of the whole value divided by m... [*] SHA-1 is not pseudorandom function, but for virtually all practical purposes it is. -Ilari -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html