Re: Mozilla .git tree

Junio C Hamano <junkio@xxxxxxx> wrote:
[snip]
> +A new style .idx file begins with a signature, "\377tOc", and a
> +version number as a 4-byte network byte order integer.  The version
> +of this implementation is 2.

Ick.  I understand why you did this (and thanks for such a good
explanation of it by the way) but what a horrible signature number
and way to wedge in a version number.

From now on I propose that if we write a file - especially a binary
file - to disk that we always include a 4 byte version number into
the header of the file.  Then we can avoid trying to wedge this into
the file after the fact when some worthwhile improvement requires
changing the format.
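
To make the proposal concrete, here is a minimal sketch in Python of a
header that carries a signature followed by a 4-byte network byte order
version number.  The signature bytes are the ones from the quoted text;
the function names are hypothetical and are not taken from any git
source.

```python
import struct

# Signature bytes as given in the quoted documentation ("\377tOc").
IDX_SIGNATURE = b"\377tOc"


def write_header(version: int) -> bytes:
    """Build the header: signature, then a 4-byte big-endian version."""
    return IDX_SIGNATURE + struct.pack(">I", version)


def read_version(data: bytes):
    """Return the version number, or None if the signature is absent
    (which a reader could treat as the old, unversioned format)."""
    if len(data) < 8 or data[:4] != IDX_SIGNATURE:
        return None
    (version,) = struct.unpack(">I", data[4:8])
    return version
```

A reader that gets None back falls through to the old format, which is
the after-the-fact wedging this proposal is trying to avoid for future
files.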

I think we probably should have done this when the binary headers
were introduced into loose objects.  Would a 4 byte file format
version number (with an initial version whose byte 0 wasn't 0x78
and which failed the % 31 test) really have hurt that much in a
loose object?
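
For reference, the test mentioned above can be sketched as follows.
A zlib stream produced with the usual 32K window starts with the byte
0x78, and its first two bytes read as a big-endian 16-bit value are a
multiple of 31.  The function name is hypothetical, and the version
header in the usage lines below is an assumed example, not an existing
format.

```python
import struct
import zlib


def looks_like_zlib(data: bytes) -> bool:
    """True if the first two bytes pass the zlib header check:
    byte 0 is 0x78 and the 16-bit big-endian word is divisible by 31."""
    if len(data) < 2:
        return False
    word = (data[0] << 8) | data[1]
    return data[0] == 0x78 and word % 31 == 0


# A deflated loose object passes the check ...
deflated = zlib.compress(b"blob 5\0hello")
# ... while an assumed 4-byte version header with version 1 does not.
versioned = struct.pack(">I", 1) + deflated
```

Any initial version number whose first byte is not 0x78 would be
distinguishable from an existing deflated object this way.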

[snip]
> + . 4-byte network byte order integer, recording the location of the
> +   next object in the main toc table.

Why not just the 4 byte object entry length?  To load an object we
have to go find the next object in the idx file so we can compute
the offset difference.  On very large packs (e.g. the Mozilla pack)
the index is 46 MiB.  Computing the size of one object could mean
jumping across the entire index from back to front, even though the
fan-out table and the binary search only poked the tail end of the
index while finding the entry.  So we're demand paging in the front
of the index just to compute a length.
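
A toy sketch of the difference, in Python.  The Entry layout here is an
assumption made up for illustration (it is not the proposed on-disk
format): each entry has its pack offset, the toc position of the object
that follows it in the pack, and a hypothetical stored length.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    offset: int      # where the object starts in the pack
    next_index: int  # toc position of the next object in the pack, -1 if last
    length: int      # hypothetical stored entry length


def length_via_next(toc: List[Entry], i: int, data_end: int) -> int:
    """Compute the length by visiting a second toc entry, which may
    sit anywhere in the index since the toc is sorted by object name."""
    entry = toc[i]
    if entry.next_index < 0:
        return data_end - entry.offset
    return toc[entry.next_index].offset - entry.offset


def length_direct(toc: List[Entry], i: int) -> int:
    """Read the length from the entry that was already found."""
    return toc[i].length


# Toc order (by name) differs from pack order (by offset).
toc = [Entry(300, -1, 50), Entry(12, 2, 88), Entry(100, 0, 200)]
```

Both functions give the same answer, but only length_via_next has to
touch an entry other than the one the search already landed on.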

Sure, the scheme you outlined allows a 64 bit difference but
uncompressed objects already can't be larger than 2**32-1 and we
could just as easily move that limit down to say 2**32-16 to leave
room for the object header and zlib header.

-- 
Shawn.