On Nov 26, 2007 8:58 PM, Nicolas Pitre <nico@xxxxxxx> wrote:
> On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
> > - Loose objects storage is difficult to work with
> >
> > The standard loose object format of DEFLATE("$type $size\0$data")
> > makes it harder to work with, as you need to inflate at least
> > part of the object just to see what the hell it is or how big
> > its final output buffer needs to be.
>
> It is a bit cumbersome indeed, but I'm afraid we're really stuck with
> it since every object SHA1 depends on that format.

Yes, now I remember: this was the same argument you used to convince me
that losing the "new" (deprecated) loose format was OK.

However, if we changed

    WRITE(DEFLATE(SHA1("$type $size\0$data")))

(where SHA1(x) = x but has the side effect of updating the SHA-1) to

    WRITE($pack_style_object_header)
    SHA1("$type $size\0")
    WRITE(DEFLATE(SHA1($data)))

then the SHA-1 result is the same, but we get the pack-style header, and
blobs can be sucked straight into packs when not deltified. The SHA-1
result is still usable at the end to rename the temporary loose object
file (and put it in the correct xx subdirectory).

Because we can't change the SHA-1 result, we unfortunately can never
drop the 2nd SHA1 call above [this is something that could have been
different, to respond to the email that started this thread]. You
didn't like the duplication between the 1st and 2nd call, but I can't
say I see that as a big deal.

> > It also makes it very hard to stream into a packfile if you have
> > determined it's not worth creating a delta for the object (or no
> > suitable delta base is available).
> >
> > The new (now deprecated) loose object format that was based on
> > the packfile header format simplified this and made it much
> > easier to work with.
>
> Not really. Since separate zlib compression levels for loose objects
> and packed objects were introduced, there have been a bunch of
> correctness issues. What do you do when both compression levels are
> different?
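[As an inline aside, before the rest of the quote: the equivalence I am
claiming above is easy to sanity-check. The following Python sketch is
purely illustrative, not git code; the "blob" header and sample data are
my own, but it shows that splitting the SHA-1 update into a header half
and a data half leaves the object id unchanged, while only the raw data
needs to pass through DEFLATE.]

```python
import hashlib
import zlib

data = b"hello, git\n"
header = b"blob %d\x00" % len(data)   # "$type $size\0"

# Current scheme: hash and deflate the concatenated header+data.
one_shot = hashlib.sha1(header + data).hexdigest()
loose_payload = zlib.compress(header + data)

# Proposed scheme: feed the hash in two steps, deflate only the data.
h = hashlib.sha1()
h.update(header)                      # SHA1("$type $size\0")
h.update(data)                        # SHA1($data)
split = h.hexdigest()
pack_payload = zlib.compress(data)    # data alone, pack-style

assert split == one_shot              # object id is unchanged
assert zlib.decompress(pack_payload) == data
```

[The pack_payload here is the part that could be copied into a pack
verbatim when no delta is made; the loose and pack payloads differ, but
the SHA-1 never does.]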
> Sometimes ignore them, sometimes not? Because the default loose object
> compression level is about speed and the default pack compression
> level is about good space reduction, the correct thing to do by
> default would have been to always decompress and recompress anyway
> when copying an otherwise unmodified loose object into a pack.

Not exactly. I did think about this. When you are packing to stdout and
only sending the resulting packfile locally, you don't want to bother
with recompressing everything. [This is the "workgroup" case that
concerns me.] In other cases, sure, recompression could help: packing
to a file means the file will probably be around for a while, so you
want to recompress if the levels are unequal, and you probably want to
recompress as well if the packfile will be sent over a "slow" link.

Thanks,
-- 
Dana L. How  danahow@xxxxxxxxx  +1 650 804 5991 cell