Patrick Steinhardt <ps@xxxxxx> writes: > The `--auto` flag for git-pack-refs(1) allows the ref backend to decide > whether or not a repack is in order. This switch has been introduced > mostly with the "reftable" backend in mind, which already knows to > auto-compact its tables during normal operations. When the flag is set, > then it will use the same auto-compaction mechanism and thus end up > doing nothing in most cases. > > The "files" backend does not have any such heuristic yet and instead Nit: s/instead/will instead/ > packs any loose references unconditionally. So we rewrite the complete > "packed-refs" file even if there's only a single loose reference to be > packed. > > Even worse, starting with 9f6714ab3e (builtin/gc: pack refs when using > `git maintenance run --auto`, 2024-03-25), `git pack-refs --auto` is > unconditionally executed via our auto maintenance, so we end up repacking > references every single time auto maintenance kicks in. And while that > commit already mentioned that the "files" backend unconditionally packs > refs now, the author obviously didn't quite think about the consequences > thereof. So while the idea was sound, we really should have added a > heuristic to the "files" backend before implementing it. > > Introduce a heuristic that decides whether or not it is worth to pack > loose references. The important factors to decide here are the number of > loose references in comparison to the overall size of the "packed-refs" > file. The bigger the "packed-refs" file, the longer it takes to rewrite > it and thus we scale up the limit of allowed loose references before we > repack. > > As is the nature of heuristics, this mechansim isn't obviously > "correct", but should rather be seen as a tradeoff between how much > resources we spend packing refs and how inefficient the ref store > becomes. For all I can say, we have successfully been using the exact > same heuristic in Gitaly for several years by now. > > Signed-off-by: Patrick Steinhardt <ps@xxxxxx> > --- > refs/files-backend.c | 75 +++++++++++++++++++++++++++++++ > refs/packed-backend.c | 18 ++++++++ > refs/packed-backend.h | 7 +++ > t/t0601-reffiles-pack-refs.sh | 85 ++++++++++++++++++++++++++++++----- > 4 files changed, 175 insertions(+), 10 deletions(-) > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 1cff65f6ae5..261e2b8570f 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1311,6 +1311,78 @@ static int should_pack_ref(struct files_ref_store *refs, > return 0; > } > > +static size_t fastlog2(size_t sz) Nit: We already `reftable/stack_test.c:fastlog2` and `bisect.c:log2i`. I wonder if it would be nice to consolidate all of them. But I guess it's okay, since the reftable code anyways is isolated from the rest of the Git code. > +{ > + size_t l = 0; > + if (!sz) > + return 0; > + for (; sz; sz /= 2) > + l++; > + return l - 1; > +} [snip] The rest of the patch looks great!
Attachment:
signature.asc
Description: PGP signature