Re: [RFC]: Pack-file object format for individual objects (Was: Revisiting large binary files issue.)

[ I read my personal mailbox first, so I didn't see this one until after I 
  had already written my version. ]

On Tue, 11 Jul 2006, sf wrote:
> 
> I just stumbled over the same fact and asked myself why there are still
> two formats. Wouldn't it make more sense to use the pack-file object
> format for individual objects as well?

Yes, see the git list for a series of patches that try to do this.

> As it happens individual objects all start with nibble 7 (deflated with
> default _zlib_ window size of 32K) whereas in the pack-file object
> format nibble 7 indicates delta entries which never occur as individual
> files.

I didn't actually do it that way, but it would be better to make 
"parse_ascii_sha1_header()" stricter, so that it only accepts the old 
object type names.

Right now my patch series could in theory accept something that is _not_ 
an ASCII header (e.g. a binary header that just happened to have the 
format "x n\0", where "n" is a valid number).
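
To make the idea concrete, here is a minimal sketch of a stricter header
check, written in Python for brevity. It is not the code from the patch
series: the function name, the 32-byte search bound, and the rule against
leading zeros are all assumptions made for illustration.

```python
# Hypothetical sketch, not git's actual parse_ascii_sha1_header().
KNOWN_TYPES = (b"blob", b"tree", b"commit", b"tag")

def parse_ascii_header(buf):
    """Return (type, size, header_len), or None if buf does not start
    with a strict "<type> <size>NUL" header."""
    nul = buf.find(b"\0", 0, 32)  # assumed bound: real headers are short
    if nul < 0:
        return None
    type_name, sep, size_text = buf[:nul].partition(b" ")
    # Reject anything that is not one of the known type names,
    # so a stray binary header like b"x 5\0" is not accepted.
    if not sep or type_name not in KNOWN_TYPES:
        return None
    # Size must be plain decimal digits; leading zeros are rejected
    # here as an extra (assumed) strictness rule.
    if not size_text.isdigit():
        return None
    if len(size_text) > 1 and size_text.startswith(b"0"):
        return None
    return type_name.decode("ascii"), int(size_text), nul + 1
```

With this, `parse_ascii_header(b"blob 5\0hello")` yields the type, the
size, and the offset where the payload starts, while a header of the form
"x n\0" is refused.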

> Step 1. When reading individual objects from disk check the first nibble
> and decode accordingly (see above).

Check more than that, but yes, this should be tightened up in my 
series.
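
One way to check more than the first nibble is to validate the whole
two-byte zlib header as described in RFC 1950. The sketch below is only an
illustration: the helper name is made up, and it assumes every old-style
object was deflated with the default 32K window, so the first byte is
always 0x78.

```python
import zlib

def looks_like_old_style_object(buf):
    """Heuristic: does buf start with a zlib header for a 32K window?"""
    if len(buf) < 2:
        return False
    cmf, flg = buf[0], buf[1]
    # 0x78: low nibble 8 = deflate, high nibble 7 = 32K window.
    # Read as a pack-style byte, the same value means type 7 (delta),
    # which never occurs as an individual object file.
    if cmf != 0x78:
        return False
    # RFC 1950 FCHECK: the two header bytes, taken as a big-endian
    # 16-bit number, must be a multiple of 31.
    return ((cmf << 8) | flg) % 31 == 0

# A stream deflated with default settings starts with nibble 7.
first_byte = zlib.compress(b"blob 0\0")[0]
```

The checksum test does not make a misdetection impossible, it only makes
it much less likely than looking at a single nibble.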

> Step 2. When writing individual objects to disk write them in pack-file
> object format. Make that optional (config-file parameter, command line
> option etc.)?

Done.
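
For readers who have not looked at the pack-file object header, here is a
rough sketch of how its type and size fields are packed into bytes. The
function names are hypothetical and this is not the code from the patch
series; it assumes the usual pack layout of a 3-bit type plus 4 size bits
in the first byte, with 7 more size bits in each continuation byte.

```python
def encode_pack_header(obj_type, size):
    """Encode a 3-bit type and a size as a variable-length header."""
    byte = (obj_type << 4) | (size & 0x0F)  # type + low 4 size bits
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)  # high bit set: more size bytes follow
        byte = size & 0x7F       # next 7 size bits
        size >>= 7
    out.append(byte)
    return bytes(out)

def decode_pack_header(buf):
    """Return (type, size, header_len) for a header made as above."""
    byte = buf[0]
    obj_type = (byte >> 4) & 0x07
    size = byte & 0x0F
    shift, pos = 4, 1
    while byte & 0x80:
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos
```

A 100-byte object of type 3 encodes to two header bytes, compared with the
nine bytes of the ASCII form "blob 100\0".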

> Step 3. Remove code for (old) individual object disk format.

Well, I'm not sure how necessary that even is. We actually do have to 
generate the old header regardless, if for no other reason than the fact 
that we generate the SHA1 names based on it (even if we then write a 
new-style dense binary header to disk and discard the ASCII header).
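
The dependency on the ASCII header can be shown in a few lines. This is a
sketch with a hypothetical helper name, not git's own code: the object
name is the SHA1 of the "type size\0" header followed by the contents,
whatever header format ends up on disk.

```python
import hashlib

def object_name(obj_type, payload):
    """SHA1 name computed over the old-style ASCII header plus payload."""
    header = b"%s %d\0" % (obj_type.encode("ascii"), len(payload))
    return hashlib.sha1(header + payload).hexdigest()

# The empty blob hashes to the well-known name e69de29b...
empty_blob = object_name("blob", b"")
```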

Having it there means that you can always just get a new version of git, 
and never worry about how old the archive you're working with is.

(And then doing a "git repack -a -d" will make any archive also work with 
an old-style git, since the pack-file format didn't change, and a "git 
repack" thus ends up always creating something that is readable by 
anybody, including old clients).

		Linus