Hi, this is the third version of my patch series to fix geometric repacking with repositories that are connected to an alternate object database. Changes compared to v2: - I've simplified patch 1 to reset the preferred pack index instead of moving around some of the checks. This also causes us to print the warning for missing preferred packfiles again. - I've fixed the logic in patch 2 to find the preferred packfile to not return a packfile that would be rolled up in a geometric repack. - I've added an additional patch to split out preexisting tests for `--stdin-packs` into their own test file. - I've changed the unportable test added for geometric repacking with `-l` that used stat(1) to instead use our `test-tool chmtime` helper. - I've changed the logic that disables writing bitmaps in git-repack to cover more cases. It now always kicks in when doing a repack with `-l` that asks for bitmaps when connected to an alternate object directory. - In general, there's a bunch of small improvements left and right for the tests I'm adding. Thanks for all the feedback so far. Patrick Patrick Steinhardt (10): midx: fix segfault with no packs and invalid preferred pack repack: fix trying to use preferred pack in alternates repack: fix generating multi-pack-index with only non-local packs pack-objects: split out `--stdin-packs` tests into separate file pack-objects: fix error when packing same pack twice pack-objects: fix error when same packfile is included and excluded pack-objects: extend test coverage of `--stdin-packs` with alternates t/helper: allow chmtime to print verbosely without modifying mtime repack: honor `-l` when calculating pack geometry repack: disable writing bitmaps when doing a local repack builtin/pack-objects.c | 10 +- builtin/repack.c | 62 ++++++++- midx.c | 6 +- object-file.c | 6 + object-store.h | 1 + t/helper/test-chmtime.c | 2 +- t/t5300-pack-object.sh | 135 ------------------- t/t5319-multi-pack-index.sh | 12 ++ t/t5331-pack-objects-stdin.sh | 240 ++++++++++++++++++++++++++++++++++ t/t7700-repack.sh | 17 +++ t/t7703-repack-geometric.sh | 165 +++++++++++++++++++++++ 11 files changed, 505 insertions(+), 151 deletions(-) create mode 100755 t/t5331-pack-objects-stdin.sh Range-diff against v2: 1: 8c384aabde ! 1: dd8145bead midx: fix segfault with no packs and invalid preferred pack @@ Commit message will cause a segfault as we blindly index into the array of packfiles, which would be empty. - Fix this bug by exiting early in case we have determined that the MIDX - wouldn't have any packfiles to index. While the request itself does not - make much sense anyway, it is still preferable to exit gracefully than - to abort. + Fix this bug by resetting the preferred packfile index to `-1` before + searching for the preferred pack. This fixes the segfault as we already + check for whether the index is `> - 1`. If it is not, we simply don't + pick a preferred packfile at all. + Helped-by: Taylor Blau <me@xxxxxxxxxxxx> Signed-off-by: Patrick Steinhardt <ps@xxxxxx> ## midx.c ## @@ midx.c: static int write_midx_internal(const char *object_dir, - for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); - stop_progress(&ctx.progress); + } -+ if (!ctx.nr) { -+ error(_("no pack files to index.")); -+ result = 1; -+ goto cleanup; -+ } + if (preferred_pack_name) { +- int found = 0; ++ ctx.preferred_pack_idx = -1; + - if ((ctx.m && ctx.nr == ctx.m->num_packs) && - !(packs_to_include || packs_to_drop)) { - struct bitmap_index *bitmap_git; + for (i = 0; i < ctx.nr; i++) { + if (!cmp_idx_or_pack_name(preferred_pack_name, + ctx.info[i].pack_name)) { + ctx.preferred_pack_idx = i; +- found = 1; + break; + } + } + +- if (!found) ++ if (ctx.preferred_pack_idx == -1) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + } else if (ctx.nr && ## t/t5319-multi-pack-index.sh ## @@ t/t5319-multi-pack-index.sh: test_expect_success 'write midx with --stdin-packs' ' @@ t/t5319-multi-pack-index.sh: test_expect_success 'write midx with --stdin-packs' + test_must_fail git -C empty multi-pack-index write \ + --stdin-packs --preferred-pack=does-not-exist </dev/null 2>err && + cat >expect <<-EOF && ++ warning: unknown preferred pack: ${SQ}does-not-exist${SQ} + error: no pack files to index. + EOF + test_cmp expect err 2: 5a6c2ead87 ! 2: c5db1bc587 repack: fix trying to use preferred pack in alternates @@ Commit message While at it, rename the function `get_largest_active_packfile()` to `get_preferred_pack()` to better document its intent. + Helped-by: Taylor Blau <me@xxxxxxxxxxxx> Signed-off-by: Patrick Steinhardt <ps@xxxxxx> ## builtin/repack.c ## @@ builtin/repack.c: static struct packed_git *get_largest_active_pack(struct pack_ return NULL; - return geometry->pack[geometry->pack_nr - 1]; + -+ for (i = geometry->pack_nr; i > 0; i--) ++ /* ++ * The preferred pack is the largest pack above the split line. In ++ * other words, it is the largest pack that does not get rolled up in ++ * the geometric repack. ++ */ ++ for (i = geometry->pack_nr; i > geometry->split; i--) + /* + * A pack that is not local would never be included in a + * multi-pack index. We thus skip over any non-local packs. @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric with pack.packSize + test_must_be_empty err && + + # We should see that a new packfile was generated. -+ find shared/.git/objects/pack -type f -name "*.pack" | sort >packs && -+ test $(wc -l <packs) = 1 && ++ find shared/.git/objects/pack -type f -name "*.pack" >packs && ++ test_line_count = 1 packs && + + # We should also see a multi-pack-index. This multi-pack-index should + # never refer to any packfiles in the alternate object database. -+ # Consequentially, it should be valid even with the alternate ODB -+ # deleted. -+ rm -r shared && -+ git -C member multi-pack-index verify ++ test -f member/.git/objects/pack/multi-pack-index && ++ test-tool read-midx member/.git/objects >packs.midx && ++ grep "^pack-.*\.idx$" packs.midx | sort >actual && ++ basename member/.git/objects/pack/pack-*.idx >expect && ++ test_cmp expect actual +' + test_done 3: f705cd6c49 ! 3: 8c3193268f repack: fix generating multi-pack-index with only non-local packs @@ Commit message Fix the code to skip non-local packfiles. Co-authored-by: Taylor Blau <me@xxxxxxxxxxxx> + Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> Signed-off-by: Patrick Steinhardt <ps@xxxxxx> ## builtin/repack.c ## @@ builtin/repack.c: static void midx_included_packs(struct string_list *include, ## t/t7703-repack-geometric.sh ## @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric --write-midx with packfiles in main and alterna - git -C member multi-pack-index verify + test_cmp expect actual ' +test_expect_success '--geometric --with-midx with no local objects' ' @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric --write-midx with + # alternate object database does not invalidate the geometric sequence. + # And we should not have a multi-pack-index because these only index + # local packfiles, and there are none. -+ find member/.git/objects/pack -type f >actual && -+ test_must_be_empty actual ++ test_dir_is_empty member/$packdir +' + test_done -: ---------- > 4: 8d47d753dc pack-objects: split out `--stdin-packs` tests into separate file 4: ff28829589 ! 5: ac528514e7 pack-objects: fix error when packing same pack twice @@ builtin/pack-objects.c: static void read_packs_list_from_stdin(void) for (p = get_all_packs(the_repository); p; p = p->next) { const char *pack_name = pack_basename(p); - ## t/t5331-pack-objects-stdin.sh (new) ## -@@ -+#!/bin/sh -+ -+test_description='pack-objects --stdin' -+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main -+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME -+ -+TEST_PASSES_SANITIZE_LEAK=true -+. ./test-lib.sh + ## t/t5331-pack-objects-stdin.sh ## +@@ t/t5331-pack-objects-stdin.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + TEST_PASSES_SANITIZE_LEAK=true + . ./test-lib.sh + ++packed_objects() { ++ git show-index <"$1" >tmp-object-list && ++ cut -d' ' -f2 tmp-object-list | sort && ++ rm tmp-object-list ++ } + + test_expect_success 'setup for --stdin-packs tests' ' + git init stdin-packs && + ( +@@ t/t5331-pack-objects-stdin.sh: test_expect_success '--stdin-packs with broken links' ' + ) + ' + +test_expect_success 'pack-objects --stdin with duplicate packfile' ' + test_when_finished "rm -fr repo" && + @@ t/t5331-pack-objects-stdin.sh (new) + test_commit "commit" && + git repack -ad && + -+ ( ++ { + basename .git/objects/pack/pack-*.pack && + basename .git/objects/pack/pack-*.pack -+ ) >packfiles && ++ } >packfiles && + + git pack-objects --stdin-packs generated-pack <packfiles && -+ test_cmp generated-pack-*.pack .git/objects/pack/pack-*.pack ++ packed_objects .git/objects/pack/pack-*.idx >expect && ++ packed_objects generated-pack-*.idx >actual && ++ test_cmp expect actual + ) +' + -+test_done + test_done ## t/t7703-repack-geometric.sh ## @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric --with-midx with no local objects' ' - test_must_be_empty actual + test_dir_is_empty member/$packdir ' +test_expect_success '--geometric with same pack in main and alternate ODB' ' 5: 8011603d6d ! 6: afd7c7864d pack-objects: fix error when same packfile is included and excluded @@ builtin/pack-objects.c: static void read_packs_list_from_stdin(void) ## t/t5331-pack-objects-stdin.sh ## -@@ t/t5331-pack-objects-stdin.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME - TEST_PASSES_SANITIZE_LEAK=true - . ./test-lib.sh - -+packed_objects() { -+ git show-index <"$1" >tmp-object-list && -+ cut -d' ' -f2 tmp-object-list && -+ rm tmp-object-list -+ } -+ - test_expect_success 'pack-objects --stdin with duplicate packfile' ' - test_when_finished "rm -fr repo" && - @@ t/t5331-pack-objects-stdin.sh: test_expect_success 'pack-objects --stdin with duplicate packfile' ' ) ' @@ t/t5331-pack-objects-stdin.sh: test_expect_success 'pack-objects --stdin with du + test_commit "commit" && + git repack -ad && + -+ ( ++ { + basename .git/objects/pack/pack-*.pack && + printf "^%s\n" "$(basename .git/objects/pack/pack-*.pack)" -+ ) >packfiles && ++ } >packfiles && + + git pack-objects --stdin-packs generated-pack <packfiles && + packed_objects generated-pack-*.idx >packed-objects && 6: 3e958bf2a4 ! 7: cd135439ae pack-objects: extend test coverage of `--stdin-packs` with alternates @@ t/t5331-pack-objects-stdin.sh: test_expect_success 'pack-objects --stdin with sa + test_commit -C member "local-commit" && + git -C member repack -dl && + -+ ( -+ basename shared/.git/objects/pack/pack-*.pack && ++ { ++ basename shared/.git/objects/pack/pack-*.pack && + basename member/.git/objects/pack/pack-*.pack -+ ) >packfiles && ++ } >packfiles && + -+ ( ++ { + packed_objects shared/.git/objects/pack/pack-*.idx && + packed_objects member/.git/objects/pack/pack-*.idx -+ ) | sort >expected-objects && ++ } | sort >expected-objects && + + git -C member pack-objects --stdin-packs generated-pack <packfiles && -+ packed_objects member/generated-pack-*.idx | sort >actual-objects && ++ packed_objects member/generated-pack-*.idx >actual-objects && + test_cmp expected-objects actual-objects +' + -: ---------- > 8: 18bac90289 t/helper: allow chmtime to print verbosely without modifying mtime 7: f014872ed4 ! 9: 285695deaf repack: honor `-l` when calculating pack geometry @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric with same pack in + # packfile and thus arrive at the conclusion that the geometric + # sequence is intact. We thus expect no changes. + # -+ # Note that we are using stat(1) to verify idempotence to also verify -+ # that the mtime did not change. This is done in order to detect the -+ # case where we do repack objects, but the resulting packfile is the -+ # same. -+ stat member/.git/objects/pack/* >expected-member-packs && ++ # Note that we are tweaking mtimes of the packfiles so that we can ++ # verify they did not change. This is done in order to detect the case ++ # where we do repack objects, but the resulting packfile is the same. ++ test-tool chmtime --verbose =0 member/.git/objects/pack/* >expected-member-packs && + git -C member repack --geometric=2 -l -d && -+ stat member/.git/objects/pack/* >actual-member-packs && ++ test-tool chmtime --verbose member/.git/objects/pack/* >actual-member-packs && + test_cmp expected-member-packs actual-member-packs && + -+ ( ++ { + packed_objects shared/.git/objects/pack/pack-*.idx && + packed_objects member/.git/objects/pack/pack-*.idx -+ ) | sort >expected-objects && ++ } | sort >expected-objects && + + # On the other hand, when doing a non-local geometric repack we should + # see both packfiles and thus repack them. We expect that the shared + # object database was not changed. -+ stat shared/.git/objects/pack/* >expected-shared-packs && ++ test-tool chmtime --verbose =0 shared/.git/objects/pack/* >expected-shared-packs && + git -C member repack --geometric=2 -d && -+ stat shared/.git/objects/pack/* >actual-shared-packs && ++ test-tool chmtime --verbose shared/.git/objects/pack/* >actual-shared-packs && + test_cmp expected-shared-packs actual-shared-packs && + + # Furthermore, we expect that the member repository now has a single 8: b10ee7fc3d ! 10: bb0ee0eb3c repack: disable writing bitmaps when doing a local geometric repack @@ Metadata Author: Patrick Steinhardt <ps@xxxxxx> ## Commit message ## - repack: disable writing bitmaps when doing a local geometric repack + repack: disable writing bitmaps when doing a local repack In order to write a bitmap, we need to have full coverage of all objects that are about to be packed. In the traditional non-multi-pack-index @@ Commit message with writing bitmaps when we have multiple packfiles as long as the multi-pack-index covers all objects. - This is not always the case though. When writing multi-pack-indices in a - repository that is connected to an alternate object directory we may end - up writing a multi-pack-index that only has partial coverage of objects. - The end result is that writing the bitmap will fail: + This is not always the case though. When asked to perform a repack of + local objects, only, then we cannot guarantee to have full coverage of + all objects regardless of whether we do a full repack or a repack with a + multi-pack-index. The end result is that writing the bitmap will fail in + both worlds: $ git multi-pack-index write --stdin-packs --bitmap <packfiles warning: Failed to write bitmap index. Packfile doesn't have full closure (object 1529341d78cf45377407369acb0f4ff2b5cdae42 is missing) @@ Commit message Now there are two different ways to fix this. The first one would be to amend git-multi-pack-index(1) to disable writing bitmaps when we notice - that we don't have full object coverage. But we ain't really got enough - information there, and seeing that it is a low-level plumbing command it - does not feel like the right place to fix this. + that we don't have full object coverage. - We can easily fix it at a higher level in git-repack(1) though. When all - of the following conditions are true: + - We don't have enough information in git-multi-pack-index(1) in + order to tell whether the local repository _should_ have full + coverage. Because even when connected to an alternate object + directory, it may be the case that we still have all objects + around in the main object database. - - We are asked to write a multi-pack index with bitmaps. + - git-multi-pack-index(1) is quite a low-level tool. Automatically + disabling functionality that it was asked to provide does not feel + like the right thing to do. - - We are asked to only include local objects via `-l`. - - - We are connected to an alternate repository that has packfiles. - - Then we will override the user's ask and disable writing bitmaps with a - warning. This is similar to what we do in git-pack-objects(1), where we - also disable writing bitmaps in case we omit an object from the pack. + We can easily fix it at a higher level in git-repack(1) though. When + asked to only include local objects via `-l` and when connected to an + alternate object directory then we will override the user's ask and + disable writing bitmaps with a warning. This is similar to what we do in + git-pack-objects(1), where we also disable writing bitmaps in case we + omit an object from the pack. Signed-off-by: Patrick Steinhardt <ps@xxxxxx> ## builtin/repack.c ## @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix) - if (pack_kept_objects < 0) - pack_kept_objects = write_bitmaps > 0 && !write_midx; - -+ if (write_midx && write_bitmaps && geometric_factor && po_args.local) { -+ struct packed_git *p; -+ -+ for (p = get_all_packs(the_repository); p; p = p->next) { -+ if (p->pack_local) -+ continue; -+ -+ /* -+ * When asked to do a local repack, but we have -+ * packfiles that are inherited from an alternate, then -+ * we cannot guarantee that the multi-pack-index would -+ * have full coverage of all objects. We thus disable -+ * writing bitmaps in that case. -+ */ -+ warning(_("disabling bitmap writing, as some objects are not being packed")); -+ write_bitmaps = 0; -+ break; -+ } -+ } -+ if (write_bitmaps && !(pack_everything & ALL_INTO_ONE) && !write_midx) die(_(incremental_bitmap_conflict_error)); ++ if (write_bitmaps && po_args.local && has_alt_odb(the_repository)) { ++ /* ++ * When asked to do a local repack, but we have ++ * packfiles that are inherited from an alternate, then ++ * we cannot guarantee that the multi-pack-index would ++ * have full coverage of all objects. We thus disable ++ * writing bitmaps in that case. ++ */ ++ warning(_("disabling bitmap writing, as some objects are not being packed")); ++ write_bitmaps = 0; ++ } ++ + if (write_midx && write_bitmaps) { + struct strbuf path = STRBUF_INIT; + + + ## object-file.c ## +@@ object-file.c: void prepare_alt_odb(struct repository *r) + r->objects->loaded_alternates = 1; + } + ++int has_alt_odb(struct repository *r) ++{ ++ prepare_alt_odb(r); ++ return !!r->objects->odb->next; ++} ++ + /* Returns 1 if we have successfully freshened the file, 0 otherwise. */ + static int freshen_file(const char *fn) + { + + ## object-store.h ## +@@ object-store.h: KHASH_INIT(odb_path_map, const char * /* key: odb_path */, + struct object_directory *, 1, fspathhash, fspatheq) + + void prepare_alt_odb(struct repository *r); ++int has_alt_odb(struct repository *r); + char *compute_alternate_path(const char *path, struct strbuf *err); + struct object_directory *find_odb(struct repository *r, const char *obj_dir); + typedef int alt_odb_fn(struct object_directory *, void *); + + ## t/t7700-repack.sh ## +@@ t/t7700-repack.sh: test_expect_success SYMLINKS '--local keeps packs when alternate is objectdir ' + test_cmp expect actual + ' + ++test_expect_success '--local disables writing bitmaps when connected to alternate ODB' ' ++ test_when_finished "rm -rf shared member" && ++ ++ git init shared && ++ git clone --shared shared member && ++ ( ++ cd member && ++ test_commit "object" && ++ git repack -Adl --write-bitmap-index 2>err && ++ cat >expect <<-EOF && ++ warning: disabling bitmap writing, as some objects are not being packed ++ EOF ++ test_cmp expect err && ++ test ! -f .git/objects/pack-*.bitmap ++ ) ++' ++ + test_expect_success 'packed obs in alt ODB are repacked even when local repo is packless' ' + mkdir alt_objects/pack && + mv .git/objects/pack/* alt_objects/pack && ## t/t7703-repack-geometric.sh ## @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric -l with non-intact geometric sequence across OD -- 2.40.0
Attachment:
signature.asc
Description: PGP signature