On Wed, Jan 18, 2012 at 11:34:16AM -0700, Eric Blake wrote: > On 01/18/2012 09:23 AM, Daniel P. Berrange wrote: > > @@ -1636,9 +1637,10 @@ cleanup: > > } > > > > > > -static unsigned long virCgroupPidCode(const void *name) > > +static int32_t virCgroupPidCode(const void *name, int32_t seed) > > { > > - return (unsigned long)name; > > + unsigned long pid = (unsigned long)name; > > + return virHashCodeGen(&pid, sizeof(pid), seed); > > I'm not sure if it matters, but you could shorten these two lines to > just one: > > return virHashCodeGen(&name, sizeof(unsigned long), seed); I actually preferred the slightly more verbose style, to make it clearer that we're actually passing in the PID as an int, cast to a pointer, instead of an int pointer. > > @@ -101,11 +95,10 @@ static void virHashStrFree(void *name) > > VIR_FREE(name); > > } > > > > - > > static unsigned long > > virHashComputeKey(virHashTablePtr table, const void *name) > > { > > - unsigned long value = table->keyCode(name); > > + uint32_t value = table->keyCode(name, table->seed); > > return (value % table->size); > > If, on a 64-bit machine, table->size ever grows larger than uint32_t, > then we've lost some hashing abilities compared to what the old unsigned > long hash gave us. Conversely, if table-size ever grows that large, > we're burning an awful lot of memory in the first place, and a hash > collision at that point is the least of our worries :) I think this is > okay, although you may want to consider changing virHashComputeKey to > return uint32_t. I changed the signature here to actually be 'size_t', likewise for the table->size variable itself. > > +++ b/src/util/hash.h > > @@ -55,12 +55,15 @@ typedef int (*virHashSearcher) (const void *payload, const void *name, > > /** > > * virHashKeyCode: > > * @name: the hash key > > + * @seed: random seed > > * > > - * Compute the hash code corresponding to the key @name > > + * Compute the hash code corresponding to the key @name, using > > + * @seed to perturb the hashing algorithm > > * > > * Returns the hash code > > */ > > -typedef unsigned long (*virHashKeyCode)(const void *name); > > +typedef int32_t (*virHashKeyCode)(const void *name, > > + int32_t seed); > > Umm, you used uint32_t in virHashComputeKey, but int32_t here. While I > don't _think_ signed values will bite us due to sign-extension in 64-bit > operations introducing a bias, I'd rather make this signature: > > typedef uint32_t (*virHashKeyCode)(const void *name, uint32_t seed); > > and adjust the callers to comply with unsigned everywhere. Yeah, that was a mistake - it should have been uint32. > That is, I propose that we ditch virRandom, and instead implement a new > function: > > /* Return an evenly distributed random number between [0,2^nbits), where > nbits must be in the range (0,64]. */ > uint64_t virRandomBits(int nbits) > { > int bits_per_iter = count_one_bits(RAND_MAX); > uint64_t ret = 0; > int32_t bits; > /* This algorithm requires that RAND_MAX == 2^n-1 for some n; > gnulib's random_r meets this property. */ > verify(((RAND_MAX + 1U) & RAND_MAX) == 0); > > virMutexLock(&randomLock); > > while (nbits > bits_per_iter) { > random_r(&randomData, &bits); > ret = (ret << bits_per_iter) | (bits & RAND_MAX); > nbits -= bits_per_iter; > } > > random_r(&randomData, &bits); > ret = (ret << nbits) | (bits & ((1 << nbits) - 1)); > > virMutexUnlock(&randomLock); > return ret; > } > > where existing calls to virRandom(256) will now be virRandomBits(8), > virRandom(1024*1024*1024) will now be virRandomBits(30), and your code > would use uint32_t seed = virRandom(32). Yep, I've done that. > > +#include "virhashcode.h" > > + > > +static uint32_t rotl(uint32_t x, int8_t r) > > +{ > > + return (x << r) | (x >> (32 - r)); > > +} > > Gnulib provides the bitrotate module, which provides "bitrotate.h" and > an implementation of this function that is guaranteed to be as efficient > as possible when compiled by gcc (possibly by using compiler-builtins to > access raw assembly on architectures that have builtin rotate > instructions). I'd suggest using that rather than rolling your own. Good point. > > > + > > +/* slower than original but is endian neutral and handles platforms that > > + * do only aligned reads */ > > +__attribute__((always_inline)) > > +static uint32_t getblock(const uint8_t *p, int i) > > +{ > > + uint32_t r; > > + size_t size = sizeof(uint32_t); > > + > > + memcpy(&r, &p[i * size], size); > > + > > + return le32toh(r); > > <endian.h>, and thus le32toh(), is not yet standardized (although POSIX > will be adding it in the future), nor is it currently provided by > gnulib. We'd have to get that fixed first. The le32toh call was only here because the code I copied wanted to be endian neutral. I don't think libvirt really cares if its hash codes are endian neutral, so I trivially just removed the le32toh call and avoid the problem of endian.h Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list