On Fri, Jan 29, 2021 at 05:53:58PM -0500, Taylor Blau wrote:

> > I'm still thinking aloud here, and not really sure which is a better
> > path. I do feel like the failure modes for the second one are less
> > risky.
>
> The more I think about it, the more I feel that the second option is the
> right approach. It seems like if you were naïvely implementing this from
> scratch, that you'd pick the second one (i.e., have pack-objects
> understand a new input mode, and then make a pack based on that).
>
> I am leery that we'd be able to get the first option "right" without
> attaching some sort of marker to each pack, especially given how
> difficult I think that this is to reason about precisely. I suppose you
> could have a .closed file corresponding to each pack, or alternatively a
> $objdir/pack/pack-geometry file which specifies the same thing, but both
> of these feel overly restrictive.

Yeah, I think my gut feeling matches yours.

> Besides having to special case the loose objects, is there any downside
> to doing the simpler thing here?

The other downside I can think of is that you can't just run
"git repack --geometric" every time, and eventually get a good result
(or one that asymptotically approaches good ;) ). I.e., you now have two
types of repacks: quick and dirty rollups, and "real" ones that do
reachability. So you need some heuristics about how often you do one
versus the other.

I'm definitely OK with that outcome. And I think we could even bake
those heuristics into a script or mode of repack (e.g., maybe
"gc --auto" would trigger a bigger repack every N times or something).
But that's what I came up with by brainstorming. :)

-Peff