Re: [PATCH] Replace hashing algorithm with murmurhash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/25/2012 09:38 AM, Daniel P. Berrange wrote:
>>> -static unsigned long virCgroupPidCode(const void *name)
>>> +static int32_t virCgroupPidCode(const void *name, int32_t seed)
>>>  {
>>> -    return (unsigned long)name;
>>> +    unsigned long pid = (unsigned long)name;
>>> +    return virHashCodeGen(&pid, sizeof(pid), seed);
>>
>> I'm not sure if it matters, but you could shorten these two lines to
>> just one:
>>
>> return virHashCodeGen(&name, sizeof(unsigned long), seed);
> 
> I actually preferred the slightly more verbose style, to make
> it clearer that we're actually passing in the PID as an int,
> cast to a pointer, instead of an int pointer.

And given the recent thread about mingw32, this is wrong anyways.
There, 'unsigned long' is 32 bits, but pid_t is 64 bits, so we'd be
losing information in our hash.  We probably need to fix this code to be
passing (void *)&pid_t and dereferencing the pointer to get to a full
pid, rather than cheating with (void*)(unsigned long)pid_t and possibly
losing bits (although since this is in a context of hashing, lost bits
only means more chance for collision, and not a fatal algorithmic flaw).

>>
>> where existing calls to virRandom(256) will now be virRandomBits(8),
>> virRandom(1024*1024*1024) will now be virRandomBits(30), and your code
>> would use uint32_t seed = virRandom(32).

Of course, here I meant virRandomBits(32),

> 
> Yep, I've done that.

but I'm sure you figured that out.

>>> +
>>> +    return le32toh(r);
>>
>> <endian.h>, and thus le32toh(), is not yet standardized (although POSIX
>> will be adding it in the future), nor is it currently provided by
>> gnulib.  We'd have to get that fixed first.
> 
> The le32toh call was only here because the code I copied wanted to be
> endian neutral. I don't think libvirt really cares if its hash codes
> are endian neutral, so I trivially just removed the le32toh call and
> avoid the problem of endian.h

Agreed - we aren't sharing hash values over the wire, so all hash values
within a particular libvirtd process will be the same endianness,
without having to waste time on swapping bytes around.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list

[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]