On 4/18/2023 6:39 AM, Jeff King wrote:
> On Mon, Apr 17, 2023 at 07:03:08PM -0400, Taylor Blau wrote:

I agree with the prior discussion that gc.bigPackThreshold is currently
misbehaving, and that stopping it from caring about cruft packs is the
best way to fix that behavior in this series.

>> It is possible that in the future we could support writing multiple
>> cruft packs (we already handle the presence of multiple cruft packs
>> fine, just don't expose an easy way for the user to write >1 of them).
>> And at that point we would be able to relax this patch a bit and allow
>> `gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
>> the benefit of avoiding loose object explosions outweighs the possible
>> drawbacks here, IMHO.
>
> I wondered if that interface might be an option to say "hey, I have a
> gigantic cruft file I want to carry forward, please leave it alone".
>
> But if you have a giant cruft pack that is making your "git gc" too
> slow, it will eventually age out on its own. And if you're impatient,
> then "git gc --prune=now" is probably the right tool.
>
> And if you really did want to keep rolling it forward for some reason,
> then I'd think marking it with ".keep" would be the best thing (and
> maybe even dropping the mtimes file? I'm not sure how a kept-cruft
> pack does or should behave).

Generally, it's probably a good idea to (later) create a separate knob
for "don't rewrite the objects in a 'big' cruft pack unless you need
to". For situations where cruft objects are being collected and not
regularly pruned, this helps avoid repacking all unreachable objects
into a giant single pack, even though only a small number of objects
were discovered unreachable this time.

The important times where we'd want to consider a 'big' cruft pack for
rewrite are:

 1. Some objects in the cruft pack are being pruned.
 2. Some objects in the cruft pack need updated mtimes.
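
To make the two conditions above concrete, here is a rough sketch of
the decision such a knob might encode. This is only an illustration,
not Git's implementation: the `CruftPack` fields, the `should_rewrite`
helper, and the `max_rewrite_size` parameter are all hypothetical
names, and the 2 GiB default is just a placeholder value.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class CruftPack:
    # All fields are hypothetical; Git does not expose such a structure.
    name: str
    size_bytes: int
    has_prunable_objects: bool  # condition 1: some objects are being pruned
    needs_mtime_update: bool    # condition 2: some objects need updated mtimes


def should_rewrite(pack: CruftPack, max_rewrite_size: int = 2 * GIB) -> bool:
    """Decide whether a cruft pack should be rewritten during maintenance."""
    # A pack must be rewritten, whatever its size, if its contents change.
    if pack.has_prunable_objects or pack.needs_mtime_update:
        return True
    # Otherwise, only packs below the (assumed) size limit are rolled
    # into the new cruft pack; larger ones are left alone.
    return pack.size_bytes < max_rewrite_size


packs = [
    CruftPack("small", 50 * 1024 ** 2, False, False),
    CruftPack("big-untouched", 5 * GIB, False, False),
    CruftPack("big-pruning", 5 * GIB, True, False),
]
for pack in packs:
    print(pack.name, should_rewrite(pack))
```

With these assumptions, only "big-untouched" is skipped: it is over the
limit and neither condition applies to it.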

However, in the typical case that we are adding new cruft objects and
not changing the mtimes of existing unreachable objects, we could
create a sensible limit on the size of a cruft pack to be rewritten
during normal maintenance.

My personal preference would be something between 2GB and 10GB, which
seems like a decent balance between "size of cruft pack" and "number of
cruft packs" for most repositories.

Since none of the objects are reachable, we don't really care about
them having good deltas for things like fetches and clones. The benefit
of reducing the time spent in 'git repack --cruft' outweighs the slight
disk space savings by having a single cruft pack, in my opinion.

Thanks,
-Stolee