Re: [PATCH v2] packfile: freshen the mtime of packfile by configuration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 14, 2021 at 09:32:26PM +0200, Ævar Arnfjörð Bjarmason wrote:
> > But if that isn't possible, then I find introducing a new file to
> > redefine the pack's mtime just to accommodate a backup system that
> > doesn't know better to be a poor justification for adding this
> > complexity. Especially since we agree that rsync-ing live Git
> > repositories is a bad idea in the first place ;).
> >
> > If it were me, I would probably stop here and avoid pursuing this
> > further. But an OK middle ground might be core.freshenPackfiles=<bool>
> > to indicate whether or not packs can be freshened, or the objects
> > contained within them should just be rewritten loose.
> >
> > Sun could then set this configuration to "false", implying:
> >
> >   - That they would have more random loose objects, leading to some
> >     redundant work by their backup system.
> >   - But they wouldn't have to resync their huge packfiles.
> >
> > ...and we wouldn't have to introduce any new formats/file types to do
> > it. To me, that seems like a net-positive outcome.
>
> This approach is getting quite close to my core.checkCollisions patch,
> to the point of perhaps being indistinguishable in practice:
> https://lore.kernel.org/git/20181028225023.26427-5-avarab@xxxxxxxxx/

Hmm, I'm not sure if I understand. That collision check is only done
during index-pack, and reading builtin/index-pack.c:check_collision(),
it looks like we only do it for large blobs anyway.

> I.e. if you're happy to re-write out duplicate objects then you're going
> to be ignoring the collision check and don't need to do it. It's not the
> same in that you might skip writing objects you know are reachable, and
> with the collisions check off and not-so-thin packs you will/might get
> more redundancy than you asked for.

We may be talking about different things, but if users are concerned
about SHA-1 collisions, then they should still be able to build with
DC_SHA1=YesPlease to catch shattered-style collisions.

Anyway, I think we may be a little in the weeds for what we are trying
to accomplish here. I'm thinking something along the lines of the
following (sans documentation and tests, of course ;)).

--- >8 ---

diff --git a/object-file.c b/object-file.c
index f233b440b2..87c9238365 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1971,9 +1971,22 @@ static int freshen_loose_object(const struct object_id *oid)
 	return check_and_freshen(oid, 1);
 }

+static int can_freshen_packs = -1;
+static int get_can_freshen_packs(void)
+{
+	 if (can_freshen_packs < 0) {
+		if (git_config_get_bool("core.freshenpackfiles",
+					&can_freshen_packs))
+			can_freshen_packs = 1;
+	 }
+	 return can_freshen_packs;
+}
+
 static int freshen_packed_object(const struct object_id *oid)
 {
 	struct pack_entry e;
+	if (!get_can_freshen_packs())
+		return 0;
 	if (!find_pack_entry(the_repository, oid, &e))
 		return 0;
 	if (e.p->freshened)



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux