On Wed, Jul 26, 2023 at 1:07 AM Taylor Blau <me@xxxxxxxxxxxx> wrote: > > On Mon, Jul 24, 2023 at 10:59:07AM +0200, Christian Couder wrote: > > A previous commit has implemented `git repack --filter=<filter-spec>` to > > allow users to filter out some objects from the main pack and move them > > into a new different pack. > > > > Users might want to perform such a cleanup regularly at the same time as > > they perform other repacks and cleanups, so as part of `git gc`. > > > > Let's allow them to configure a <filter-spec> for that purpose using a > > new gc.repackFilter config option. > > Makes sense. > > > Now when `git gc` will perform a repack with a <filter-spec> configured > > through this option and not empty, the repack process will be passed a > > corresponding `--filter=<filter-spec>` argument. > > I may be missing something, but what happens if the user has configured > gc.repackFilter, but passes additional filters over the command-line > arguments? I'm not sure whether these should be AND'd with the existing > filters in config, or if they should reset them to zero, or something > else. `git gc` doesn't recognize `--filter=<...>` arguments, only `git repack` is being teached to recognize it in this patch series. So I don't see how there could be multiple such arguments on the command line when `git gc` is used. Also in version 4 `git repack` can be passed many such arguments anyway. So I think we are good. We could support multiple gc.repackFilter config options, but on the other hand using something like `combine:<filter1>+<filter2>+...<filterN>` should work, as the content of the option is passed as-is to the command line. So we can leave that improvement for later if people don't like the `combine:...` and are interested in it. > Regardless, I think it would be beneficial to users if we spelled this > out in git-gc(1) instead of just this patch message here. I am not sure what should be spelled out. I think we refer people to the `repack --filter=...` option which in turn refers to the `rev-list --filter=...` which contains a good amount of documentation about how `--filter=...` works, including the fact that `combine:...` can be used and that multiple `--filter=...` options can be passed. > > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > > index 69509d0c11..5b89faf505 100755 > > --- a/t/t6500-gc.sh > > +++ b/t/t6500-gc.sh > > @@ -202,6 +202,18 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e > > grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out > > ' > > > > +test_expect_success 'gc.repackFilter launches repack with a filter' ' > > + test_when_finished "rm -rf bare.git" && > > + git clone --no-local --bare . bare.git && > > + > > + git -C bare.git -c gc.cruftPacks=false gc && > > + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && > > + > > + GIT_TRACE=$(pwd)/trace.out git -C bare.git -c gc.repackFilter=blob:none -c repack.writeBitmaps=false -c gc.cruftPacks=false gc && > > Nit: can we wrap this across multiple lines? Done in version 4. > > + test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack && > > + grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out > > +' > > I think the `test_subcommand` helper might work here, and it would allow > you to avoid writing a long grep invocation. Other tests related to gc.reflogExpire above use a grep invocation similar to this one, while `test_subcommand` isn't used in the test script, so I think the grep invocation makes the whole script a bit easier to understand. Thanks.