Re: Changed path filter hash differs from murmur3 if char is signed

Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

> So...how do we proceed? I can see at least 2 ways:
>
>  - Decide that we're going to stick with the details of the existing
>    implementation and declare that "data" is always interpreted as signed.
>    In that case, I would put "signed" wherever necessary, rename the
>    function to something that is not "murmur3", and change the names of
>    byte1 etc. to indicate that they are not constrained to be less than or
>    equal to 0xff.
>
>  - Bump the version number to 2 and correct the implementation to
>    match murmur3 (so, "data" is unsigned). Then we would have to think of
>    a transition plan. One possible one might be to always reject version
>    1 bloom filters, which I'm personally OK with, but it may seem too
>    heavy a cost to some since in the perhaps typical case where a repo has
>    filenames restricted to 0x7f and below, the existing bloom filters are
>    still correct.

If path filter hashing were merely advisory, in the sense that if
matching data is found, great, the processing goes faster, but if
not, we would still get correct results albeit not so quickly, a third
option would be to just update the implementation without updating
the version number.  But we may not be so lucky---you must have seen
a wrong result returned quickly, which is not what we want to see.
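
For readers following along, the signedness problem can be sketched
roughly as below.  This is only an illustration and not the code in
the tree: the helper names pack_word_signed() and
pack_word_unsigned() are made up, and it shows only the step that
packs four input bytes into a 32-bit word, not the whole hash.  When
the input is read through a signed char, a byte above 0x7f is
sign-extended before the shift, so byte1 etc. can exceed 0xff and
smear 1-bits over the rest of the word.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack 4 bytes into a little-endian word while
 * reading the input as signed char.  A byte above 0x7f becomes a
 * negative value and is sign-extended when converted to uint32_t.
 */
uint32_t pack_word_signed(const void *data)
{
	const signed char *p = data;
	uint32_t byte1 = (uint32_t)p[0];
	uint32_t byte2 = ((uint32_t)p[1]) << 8;
	uint32_t byte3 = ((uint32_t)p[2]) << 16;
	uint32_t byte4 = ((uint32_t)p[3]) << 24;

	return byte1 | byte2 | byte3 | byte4;
}

/*
 * Hypothetical helper: the same packing, reading the input as
 * unsigned char, so every byteN stays within its own 8 bits.
 */
uint32_t pack_word_unsigned(const void *data)
{
	const unsigned char *p = data;
	uint32_t byte1 = (uint32_t)p[0];
	uint32_t byte2 = ((uint32_t)p[1]) << 8;
	uint32_t byte3 = ((uint32_t)p[2]) << 16;
	uint32_t byte4 = ((uint32_t)p[3]) << 24;

	return byte1 | byte2 | byte3 | byte4;
}
```

For a pure ASCII input such as "abcd" the two helpers agree.  For
the bytes 0x61 0xc3 0xa9 0x62 (a UTF-8 "e" with acute accent between
"a" and "b") they do not, so on a typical two's complement platform
the hash of such a path depends on how the bytes were read.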

But if I recall correctly we made the file format in such a way that
bumping the version number is cheap, in that the transition can appear
seamless.  An updated implementation can just be told to _ignore_
old and possibly incorrect Bloom filters until it gets told to
recompute, at which time it can write a correct one with a new
version number.  So I would prefer your "Bump the version number"
option, ignoring the old and possibly wrong data.
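
The "ignore until recomputed" behaviour could look something like
the sketch below.  Everything in it is an assumption made for
illustration: struct path_filter, HASH_VERSION_CURRENT and
usable_filter() are hypothetical names and do not describe the
existing on-disk format or API.  The idea is only that a filter
written with a different hash version is treated as if it were
absent, so the caller falls back to the slow but correct path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the hash version this implementation reads and writes. */
#define HASH_VERSION_CURRENT 2

/* Hypothetical in-memory view of a stored changed-path filter. */
struct path_filter {
	uint32_t hash_version;       /* version recorded when it was written */
	const unsigned char *bits;   /* filter bits (unused in this sketch) */
	size_t len;                  /* number of bytes in bits */
};

/*
 * Return the filter if it can be trusted, or NULL if it is missing
 * or was written with another hash version.  A NULL result means
 * "no filter available", so the caller must do the full comparison
 * instead of consulting possibly incorrect data.
 */
const struct path_filter *usable_filter(const struct path_filter *f)
{
	if (!f || f->hash_version != HASH_VERSION_CURRENT)
		return NULL;
	return f;
}
```

With this shape, old data never produces a wrong answer; it merely
stops helping until the filters are recomputed and rewritten with
the new version number.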

Thanks.


