On Tue, Feb 15, 2022 at 10:12:23AM +0100, Christian Couder wrote:
> On Fri, Feb 11, 2022 at 9:25 AM Patrick Steinhardt <ps@xxxxxx> wrote:
> >
> > When fetching with the `--prune` flag we will delete any local
> > references matching the fetch refspec which have disappeared on the
> > remote. This step is not currently covered by the `--atomic` flag: we
> > delete branches even though updating of local references has failed,
> > which means that the fetch is not an all-or-nothing operation.
>
> It could perhaps be seen as a regression by some users though, if
> updating of local references doesn't work anymore when deleting a
> local reference matching the fetch refspec fails.

I guess the same comment applies here as for the other patch: the
documentation says that we either update all or no refs, and that's not
what was happening prior to this patch.

> > Fix this bug by passing the global transaction into `prune_refs()`:
> > if one is given, then we'll only queue up deletions and not commit
> > them right away.
> >
> > This change also improves performance when pruning many branches in
> > a repository with a big packed-refs file: every reference is pruned
> > in its own transaction, which means that we potentially have to
> > rewrite the packed-refs file for every single reference we're about
> > to prune.
>
> Yeah, I wonder if there could be a performance improvement in the
> previous patch too as it looks like tag backfilling wasn't atomic
> either.

I doubt it would be as measurable as it is here. The reason why we see
this speedup is that for every deleted ref, we need to rewrite the
complete contents of the packed-refs file with only that single ref
removed from it. So for 10k refs, we essentially write the file 10k
times. (A rough sketch of what queueing the deletions into a single
transaction looks like is at the end of this mail.)

For the backfilling case that doesn't happen: we only write the new
loose refs, and that's not any faster when we use a single transaction.
Sure, we'll likely be able to shed some of the overhead by using only a
single transaction, but it will not be as pronounced as it is here.

This will be different though as soon as the reftable backend lands:
there we'd write all new refs in a single slice, and that's definitely
more efficient than writing one slice per backfilled ref.

Patrick

> > The following benchmark demonstrates this: it performs a pruning
> > fetch from a repository with a single reference into a repository
> > with 100k references, which causes us to prune all but one reference.
> > This is of course a very artificial setup, but it serves to
> > demonstrate the impact of only having to write the packed-refs file
> > once:
> >
> >   Benchmark 1: git fetch --prune --atomic +refs/*:refs/* (HEAD~)
> >     Time (mean ± σ):      2.366 s ±  0.021 s    [User: 0.858 s, System: 1.508 s]
> >     Range (min … max):    2.328 s …  2.407 s    10 runs
> >
> >   Benchmark 2: git fetch --prune --atomic +refs/*:refs/* (HEAD)
> >     Time (mean ± σ):      1.369 s ±  0.017 s    [User: 0.715 s, System: 0.641 s]
> >     Range (min … max):    1.346 s …  1.400 s    10 runs
> >
> >   Summary
> >     'git fetch --prune --atomic +refs/*:refs/* (HEAD)' ran
> >       1.73 ± 0.03 times faster than 'git fetch --prune --atomic +refs/*:refs/* (HEAD~)'
>
> Nice!
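
For anyone who wants to visualize the point about queueing: below is a
rough sketch of what handing the prune deletions to one shared
transaction looks like. This is not the actual patch; the function name
is made up, error handling is simplified, and the exact parameters
passed to git's internal refs API may differ from what the series ends
up doing.

    /*
     * Sketch only: queue every stale ref into the caller-provided
     * transaction instead of committing one transaction per deletion.
     * The caller commits the transaction once, after all updates and
     * deletions have been queued, so the packed-refs file is rewritten
     * at most once instead of once per pruned ref.
     */
    #include "cache.h"
    #include "refs.h"
    #include "remote.h"

    static void queue_prune_deletions(struct ref *stale_refs,
                                      struct ref_transaction *transaction)
    {
            struct strbuf err = STRBUF_INIT;
            struct ref *ref;

            for (ref = stale_refs; ref; ref = ref->next)
                    /* Queue the deletion; nothing is written yet. */
                    if (ref_transaction_delete(transaction, ref->name,
                                               &ref->old_oid, 0, NULL,
                                               &err))
                            die("%s", err.buf);

            strbuf_release(&err);
    }

The interesting part is only that ref_transaction_commit() then runs a
single time in the caller, which is where the speedup in the benchmark
above comes from.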