Re: Distribution of longest common hash prefixes

On Tue, 3 Apr 2007, Shawn O. Pearce wrote:
> 
> Well, the other thing is those 2 commits at 9 bytes probably were
> not that way a year ago.  One of those might have only needed 8,
> and the other is newer, so now you need 9.

Well, the two objects at 9 nibbles may not be (and probably aren't) 
commits, and of the 32 8-nibble cases who knows how many are actually 
commits (probably none), so an 8-nibble SHA1 is *probably* unique, at 
least if you just look at commits.

Remove the "--objects" to find out.
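
For anyone who wants to reproduce this kind of count, here is a minimal
sketch (not the command used earlier in this thread). It assumes the
object names arrive on stdin, one per line, for example from
"git rev-list --all --objects", or from "git rev-list --all" for commits
only. The names common_prefix_len and prefix_histogram are made up for
illustration. After sorting, the id that shares the longest prefix with a
given id is always one of its two neighbours, so comparing neighbours is
enough.

```python
import sys
from collections import Counter


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits (nibbles) that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_histogram(ids):
    """Map 'longest prefix shared with any other id' -> number of ids."""
    ids = sorted(set(ids))
    hist = Counter()
    for i, cur in enumerate(ids):
        # In sorted order the closest match is the previous or next id.
        left = common_prefix_len(ids[i - 1], cur) if i > 0 else 0
        right = common_prefix_len(cur, ids[i + 1]) if i + 1 < len(ids) else 0
        hist[max(left, right)] += 1
    return hist


if __name__ == "__main__":
    # Each input line is "<sha1>" or "<sha1> <path>"; keep only the id.
    ids = [line.split()[0] for line in sys.stdin if line.strip()]
    for length, count in sorted(prefix_histogram(ids).items()):
        print(f"{length:2d} {count}")
```

An id that shares N nibbles with some other id needs at least N+1 nibbles
to be unambiguous, so read the left-hand column with that offset in mind.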

> What the above tells me is that 8 is almost a safe default for our
> abbreviations, but isn't safe enough, as there are collisions past 8.

Yeah, the short SHA1 form is obviously always going to be risky. But 
since people almost always use it just for commits, it's probably good 
enough in practice, and even if you get a collision in 8 nibbles, most 
of the time it will be trivial to figure out which one was meant, so 
it's not a disaster if somebody ends up reporting a bug with a 
non-unique abbreviation.
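
To put a rough number on "risky", here is a back-of-the-envelope sketch.
It assumes object names behave like uniformly random hex strings, and the
function name and the object counts in the loop are hypothetical, not
measurements of any repository. With n objects there are n*(n-1)/2 pairs,
and each pair agrees on its first k nibbles with probability 1/16^k.

```python
def expected_colliding_pairs(n_objects: int, nibbles: int) -> float:
    """Expected number of id pairs sharing their first `nibbles` hex digits,
    assuming ids are uniformly random (an idealisation)."""
    pairs = n_objects * (n_objects - 1) / 2
    return pairs / 16 ** nibbles


if __name__ == "__main__":
    # Hypothetical object counts, chosen only to show how the estimate scales.
    for n in (10_000, 100_000, 1_000_000):
        print(n, expected_colliding_pairs(n, 8))
```

The estimate grows with the square of the object count, which is why
counting commits alone gives far fewer collisions than counting every
object.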

		Linus