On Wed, Jul 14, 2021 at 09:32:26PM +0200, Ævar Arnfjörð Bjarmason wrote: > > But if that isn't possible, then I find introducing a new file to > > redefine the pack's mtime just to accommodate a backup system that > > doesn't know better to be a poor justification for adding this > > complexity. Especially since we agree that rsync-ing live Git > > repositories is a bad idea in the first place ;). > > > > If it were me, I would probably stop here and avoid pursuing this > > further. But an OK middle ground might be core.freshenPackfiles=<bool> > > to indicate whether or not packs can be freshened, or the objects > > contained within them should just be rewritten loose. > > > > Sun could then set this configuration to "false", implying: > > > > - That they would have more random loose objects, leading to some > > redundant work by their backup system. > > - But they wouldn't have to resync their huge packfiles. > > > > ...and we wouldn't have to introduce any new formats/file types to do > > it. To me, that seems like a net-positive outcome. > > This approach is getting quite close to my core.checkCollisions patch, > to the point of perhaps being indistinguishable in practice: > https://lore.kernel.org/git/20181028225023.26427-5-avarab@xxxxxxxxx/ Hmm, I'm not sure if I understand. That collision check is only done during index-pack, and reading builtin/index-pack.c:check_collision(), it looks like we only do it for large blobs anyway. > I.e. if you're happy to re-write out duplicate objects then you're going > to be ignoring the collision check and don't need to do it. It's not the > same in that you might skip writing objects you know are reachable, and > with the collisions check off and not-so-thin packs you will/might get > more redundancy than you asked for. We may be talking about different things, but if users are concerned about SHA-1 collisions, then they should still be able to build with DC_SHA1=YesPlease to catch shattered-style collisions. Anyway, I think we may be a little in the weeds for what we are trying to accomplish here. I'm thinking something along the lines of the following (sans documentation and tests, of course ;)). --- >8 --- diff --git a/object-file.c b/object-file.c index f233b440b2..87c9238365 100644 --- a/object-file.c +++ b/object-file.c @@ -1971,9 +1971,22 @@ static int freshen_loose_object(const struct object_id *oid) return check_and_freshen(oid, 1); } +static int can_freshen_packs = -1; +static int get_can_freshen_packs(void) +{ + if (can_freshen_packs < 0) { + if (git_config_get_bool("core.freshenpackfiles", + &can_freshen_packs)) + can_freshen_packs = 1; + } + return can_freshen_packs; +} + static int freshen_packed_object(const struct object_id *oid) { struct pack_entry e; + if (!get_can_freshen_packs()) + return 0; if (!find_pack_entry(the_repository, oid, &e)) return 0; if (e.p->freshened)