Re: [PATCH v2] Custom compression levels for objects and packs

On Wed, 9 May 2007, Dana How wrote:

> OK,  I got really confused here, so I looked over the code,
> and figured out 2 causes for my confusion.
> (1) core.legacyheaders controls use_legacy_headers, which defaults to 1.
> So currently all loose objects are in legacy format and the code block
> I spoke of doesn't trigger [without a config setting].  I didn't realize
> legacy headers were still being produced (misled by the name!).
> (2) I read your "setting core.legacyheaders" as followed by TRUE,
> but you meant FALSE, which is not the default.
> 
> I also read that 1 year after 1.4.2, the default for core.legacyheaders is
> going to change to FALSE.  I think our discussion should assume this has
> happened.

<tangential comment>

Now that we encourage and actively preserve objects in a packed form 
more aggressively than we did at the time core.legacyheaders was 
introduced, I really wonder if this option is still worth it.  Because 
the packing of loose objects has to go through the delta match loop 
anyway, and since most of them should end up being deltified in most 
cases, there is really little advantage in having this parallel loose 
object format, as the CPU savings it might provide are rather lost in 
the noise in the end.  

So I suggest that we get rid of core.legacyheaders, preserve the legacy 
format as the only writable loose object format and deprecate the other 
one to keep things simpler.  Thoughts?

</tangential comment>

> So let's assume FALSE in the following.  The point of that is that 
> such a FALSE setting can't be assumed to have any special intent; it 
> will be the default.
> 
> [Everything I write here boils down to only one question,
> which I repeat at the end.]
> 
> Data gets into a pack in these ways:
> 1. Loose object copied in;
> 2. Loose object newly deltified;
> 3. Packed object to be copied;
> 4. Packed object to be newly deltified;
> 5. Packed deltified object we can't re-use;
> 6. Packed deltified object we can re-use.
> ["copied" includes recompressed.]

I think you forgot "packed undeltified objects we can reuse".

> In (2), (4), and (5), pack.compression will always be newly used.
> If pack.compression doesn't change,  this means (6)
> will be using pack.compression since it comes from (2) or (4).
> So if I "guarantee" that (1) uses pack.compression,
> (3) will as well, meaning everything in the pack will be
> at pack.compression.
> 
> Thus if pack.compression != core.loosecompression takes precedence
> over core.legacyheaders = false,  then for pack.compression constant
> we get all 6 cases at level pack.compression.  If core.legacyheaders =
> false takes precedence as you suggest,  then all undeltified objects
> (20%?) will be stuck at core.loosecompression [since I see no way
> to sensibly re-apply compression to something copied pack-to-pack].
> 
> I think this is inconsistent with what a pack.compression !=
> core.loosecompression setting is telling us.
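The quoted argument can be restated as a small model.  This is only a 
sketch in Python, not git code: the function name and the 
recompress_loose switch are hypothetical, and it assumes 
pack.compression has stayed constant across repacks, as the quoted text 
does.

```python
def resulting_level(case: int, pack_level: int, loose_level: int,
                    recompress_loose: bool) -> int:
    """Compression level an entry ends up with, for quoted cases 1..6.

    recompress_loose is a hypothetical switch, not a git option:
      True  models "a differing pack.compression takes precedence",
      False models "loose data is copied into the pack as-is".
    Assumes pack.compression never changed between repacks.
    """
    if case in (2, 4, 5):
        # Newly deltified data is always compressed afresh.
        return pack_level
    if case == 6:
        # A reused delta was produced earlier by case (2) or (4).
        return pack_level
    if case in (1, 3):
        # Undeltified data: case (3) is a copy of an earlier case (1).
        return pack_level if recompress_loose else loose_level
    raise ValueError(f"unknown case {case}")


if __name__ == "__main__":
    for policy in (True, False):
        levels = {c: resulting_level(c, 9, 1, policy) for c in range(1, 7)}
        print(policy, levels)
```

With the switch on, all six cases land at the pack level; with it off, 
cases (1) and (3) stay at the loose level.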

OK I see that I missed the fact that git-repack -f (or git-pack-objects 
--no-reuse-delta) does not recompress undeltified objects.  Note this is 
a problem in the case where you change pack.compression to a different 
value or override it on the command line as well: reused undeltified 
objects won't get recompressed with the new level.  My rant on 
core.legacyheaders and its removal would address the first case.  Your 
test for a difference between loose and packed compression levels is 
flawed because the value of core.compression does not necessarily 
represent the compression level that was used for the loose objects to 
pack (core.compression might have been modified since then), but it 
only addresses the first case too.  And this is a problem even now.

What we need instead is a --no-reuse-object that would force 
recompression of everything when you really want to enforce a specific 
compression level across the whole pack(s).
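To illustrate the idea, here is a minimal sketch in Python rather than 
the actual pack-objects code.  The function name and the 
no_reuse_object parameter are made up for the example; only the zlib 
calls are real.  It shows the choice for one undeltified object: copy 
the stored zlib stream unchanged, or inflate it and deflate it again at 
the requested level.

```python
import zlib


def pack_payload(stored: bytes, level: int, no_reuse_object: bool) -> bytes:
    """Return the zlib stream to write for one undeltified object.

    stored          -- the object's existing zlib-compressed data
    level           -- the compression level we want to enforce
    no_reuse_object -- hypothetical flag modelling the proposed option
    """
    if not no_reuse_object:
        # Reuse: the bytes are copied as-is, at whatever level was
        # in effect when they were first compressed.
        return stored
    # Forced recompression: inflate, then deflate at the wanted level.
    raw = zlib.decompress(stored)
    return zlib.compress(raw, level)


if __name__ == "__main__":
    data = b"some object content\n" * 200
    old = zlib.compress(data, 1)
    print(len(old), len(pack_payload(old, 9, False)), len(pack_payload(old, 9, True)))
```

Without the flag the old stream is passed through untouched, which is 
exactly why a changed compression level never reaches reused objects.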


Nicolas