Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`

On Mon, Apr 17, 2023 at 03:54:35PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> >   - The same is true for `gc.bigPackThreshold`, if the size of the cruft
> >     pack exceeds the limit set by the caller.
>
> This is not as cut-and-dried clear as the previous one.  "This pack
> is so large that it is not worth rewriting it only to expunge a
> handful of objects that are no longer reachable from it" is the main
> motivation to use this configuration, but doesn't some part of the
> same reasoning apply equally to a large cruft pack?  But let's
> assume that the configuration is totally irrelevant to cruft packs
> and read on.

This is an inherent design trade-off. I imagine that callers who want to
avoid rewriting their (large) cruft packs would prefer to generate a new
cruft pack on top with just the recently accumulated unreachable
objects.

That kind of works, except if you need to prune objects that are packed
in an earlier cruft pack. If `gc.bigPackThreshold` is set and applies to
cruft packs, there is no way to expire objects that live in a cruft pack
above that threshold.

A user may be frustrated to find that trying to `git gc --prune` some
sensitive object(s) from their repository doesn't appear to work, only
to discover that `gc.bigPackThreshold` is set somewhere in their
configuration.

Writing (largely) the same cruft pack to expunge a few objects isn't
ideal, but it is better than the status quo. And if you have so many
unreachable objects that this is a concern, it is probably time to prune
anyway.

It is possible that in the future we could support writing multiple
cruft packs (we already handle the presence of multiple cruft packs
fine, we just don't expose an easy way for the user to write >1 of them).
And at that point we would be able to relax this patch a bit and allow
`gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
the benefit of avoiding loose object explosions outweighs the possible
drawbacks here, IMHO.

> >  --keep-largest-pack::
> > -	All packs except the largest pack and those marked with a
> > -	`.keep` files are consolidated into a single pack. When this
> > -	option is used, `gc.bigPackThreshold` is ignored.
> > +	All packs except the largest pack, any packs marked with a
> > +	`.keep` file, and any cruft pack(s) are consolidated into a
> > +	single pack. When this option is used, `gc.bigPackThreshold` is
> > +	ignored.
>
> "except the largest pack" -> "except the largest, non-cruft pack"

Indeed, good eyes.
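
To make the intended selection concrete, here is a toy model of it. This
is a sketch only: `struct toy_pack` and `toy_find_base_packs()` are
made-up names for illustration, not the code in builtin/gc.c. It assumes
a non-zero limit stands for `gc.bigPackThreshold` and a zero limit
stands for `--keep-largest-pack`. In both modes cruft packs are skipped
before any size comparison happens.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pack; not git's own pack structure. */
struct toy_pack {
	const char *name;
	unsigned long size;
	int is_cruft;
};

/*
 * Decide which packs are kept out of the repack. keep[i] is set to 1
 * for every kept pack and 0 otherwise; the return value is the number
 * of kept packs.
 *
 * limit != 0: keep every non-cruft pack whose size is >= limit.
 * limit == 0: keep only the largest non-cruft pack.
 */
size_t toy_find_base_packs(const struct toy_pack *packs, size_t nr,
			   unsigned long limit, int *keep)
{
	const struct toy_pack *base = NULL;
	size_t i, base_idx = 0, kept = 0;

	for (i = 0; i < nr; i++) {
		keep[i] = 0;

		/* Cruft packs are never candidates, whatever their size. */
		if (packs[i].is_cruft)
			continue;

		if (limit) {
			if (packs[i].size >= limit) {
				keep[i] = 1;
				kept++;
			}
		} else if (!base || base->size < packs[i].size) {
			base = &packs[i];
			base_idx = i;
		}
	}

	if (base) {
		keep[base_idx] = 1;
		kept++;
	}

	return kept;
}
```

With packs of size 100, 900 (cruft), and 500, a zero limit keeps only
the 500 pack even though the cruft pack is larger, which is what "the
largest, non-cruft pack" is meant to convey.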

Thanks,
Taylor


