On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
>> Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
>> space/time trade-off.
>
> Following the on-disk format experiment yesterday, I changed the
> format to:
>
>  - a list of _short_ SHA-1 of cached commits
>  ..
>
> The length of SHA-1 is chosen to be able to unambiguously identify any
> cached commits. Full SHA-1 check is done after to catch false
> positives. For linux-2.6, SHA-1 length is 6 bytes, git and many
> moderate-sized projects are 4 bytes.

And if we are going to create index v3, the same trick could be used
for the sha-1 table in the index. We would use the short sha-1 table
for binary search and put the rest of each sha-1 in a following table
(just like the file offset table). The advantage is a denser search
space, about 1/4-1/3 the size of the full sha-1 table.
-- 
Duy
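
A rough sketch of the split-table lookup described above, in Python
for brevity. This is only an illustration under assumptions, not the
actual on-disk format or git code: the names build_tables, lookup and
shortest_unique_prefix_len are made up here, and the tables are plain
Python lists rather than packed byte arrays. The prefix length is
picked as the smallest number of bytes that keeps all stored SHA-1s
distinct, the short table is binary searched, and the remaining bytes
are compared afterwards to reject false positives.

```python
import bisect
import hashlib


def shortest_unique_prefix_len(sorted_shas):
    """Smallest prefix length (in bytes) that keeps all entries distinct.

    Assumes sorted_shas is sorted and contains no duplicates.
    """
    need = 1
    for a, b in zip(sorted_shas, sorted_shas[1:]):
        common = 0
        while common < len(a) and a[common] == b[common]:
            common += 1
        # Neighbours differ at byte index `common`, so we need common + 1 bytes.
        need = max(need, common + 1)
    return need


def build_tables(shas):
    """Split full SHA-1s into a short (search) table and a 'rest' table."""
    ordered = sorted(set(shas))
    k = shortest_unique_prefix_len(ordered)
    short = [s[:k] for s in ordered]  # dense table used for binary search
    rest = [s[k:] for s in ordered]   # remaining bytes, same order
    return k, short, rest


def lookup(sha, k, short, rest):
    """Return the table position of sha, or -1 if it is not stored."""
    prefix = sha[:k]
    i = bisect.bisect_left(short, prefix)
    if i < len(short) and short[i] == prefix:
        # Prefix hit; compare the remaining bytes to catch false positives.
        if rest[i] == sha[k:]:
            return i
    return -1


if __name__ == "__main__":
    # Hypothetical data: SHA-1s of small strings stand in for object names.
    shas = [hashlib.sha1(str(n).encode()).digest() for n in range(1000)]
    k, short, rest = build_tables(shas)
    print("prefix length in bytes:", k)
    print("found at:", lookup(shas[0], k, short, rest))
```

Because the prefixes are unique among the stored entries, a prefix hit
leaves exactly one candidate, so the full check is a single compare
against the second table.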