Re: [PATCH 1/2] repack: respect kept objects with '--write-midx -b'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> 
> Historically, we needed a single packfile in order to have reachability
> bitmaps. This introduced logic that when 'git repack' had a '-b' option
> that we should stop sending the '--honor-pack-keep' option to the 'git
> pack-objects' child process, ensuring that we create a packfile
> containing all reachable objects.
> 
> In the world of multi-pack-index bitmaps, we no longer need to repack
> all objects into a single pack to have valid bitmaps. Thus, we should
> continue sending the '--honor-pack-keep' flag to 'git pack-objects'.
> 
> The fix is very simple: only disable the flag when writing bitmaps but
> also _not_ writing the multi-pack-index.
> 
> This opens the door to new repacking strategies that might want to keep
> some historical set of objects in a stable pack-file while only
> repacking more recent objects.

That all makes sense. Another way of thinking about it: we disable
--honor-pack-keep so we that keep packs do not interfere with writing a
pack bitmap. But with --write-midx, we skip the pack bitmap entirely.

In the end it's the same, but I wanted to emphasize that the important
hing is not so much that we are writing a midx bitmap as that we are
_not_ writing a pack bitmap. And that is what makes this OK to do (that
the repack code already disables the pack bitmap when writing a midx
one).

> diff --git a/builtin/repack.c b/builtin/repack.c
> index 9b0be6a6ab3..1f128b7c90b 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  		write_bitmaps = 0;
>  	}
>  	if (pack_kept_objects < 0)
> -		pack_kept_objects = write_bitmaps > 0;
> +		pack_kept_objects = write_bitmaps > 0 && !write_midx;

So the code change here looks good.

> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0260ad6f0e0..8c4ba6500be 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -372,4 +372,19 @@ test_expect_success '--write-midx with preferred bitmap tips' '
>  	)
>  '
>  
> +test_expect_success '--write-midx -b packs non-kept objects' '
> +	git init midx-kept &&
> +	test_when_finished "rm -fr midx-kept" &&
> +	(
> +		cd midx-kept &&
> +		test_commit_bulk 100 &&
> +		GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> +			git repack --write-midx -a -b &&
> +		cat trace.txt | \
> +			grep \"event\":\"start\" | \
> +			grep pack-objects | \
> +			grep \"--honor-pack-keep\"
> +	)
> +'

This looks correct, though:

  - do we really need this separate repo directory? The test before it
    uses one, but only because it needs a very specific set of commits.
    There is a long-running "midx" directory we could use, though I
    think just the regular test repo would be fine, too.

  - likewise, do we need 100 commits? They are not too expensive these
    days with test_commit_bulk, but I think we don't even care about the
    repo contents at all.

  - there is no actual .keep file in your test. That's OK, as we are
    just checking the args passed to pack-objects.

  - useless use of cat. :) Also, you could probably do it all with one
    grep. This is bikeshedding, of course, but it's nice to keep process
    counts low in the test suite. Also, your middle grep is looser than
    the others (it doesn't check for surrounding quotes).

So something like this would work:

  test_expect_success '--write-midx -b packs non-kept objects' '
          GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \
                  git -C midx repack --write-midx -a -b &&
          grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \
                  midx-keep-bitmap.trace
  '

One could perhaps argue that the combined grep is less readable. If
that's a concern, I'd suggest wrapping it in a function like:

  # usage: check_trace2_arg <trace_file> <cmd> <arg>
  check_trace2_arg () {
	grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1"
  }

All just suggestions, of course. I'd be happy enough to see the patch go
in as-is.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux