While attempting to fix a reference negotiation bug in bundle-uri, we
identified that the fetch process lacks some crucial object validation checks
when processing bundles. The primary issues are:

 1. In the bundle-uri scenario, object IDs were not validated before writing
    bundle references. This was the root cause of the original negotiation
    bug in bundle-uri and could lead to repository corruption.
 2. The existing "fetch.fsckObjects" and "transfer.fsckObjects"
    configurations were not applied when directly fetching bundles or
    fetching with bundle-uri enabled. In fact, there was no object
    validation support for unbundle at all.

The first patch addresses the bundle-uri negotiation issue by removing the
REF_SKIP_OID_VERIFICATION flag when writing bundle references.

Patches 2 and 3 extend verify_bundle_flags for bundle.c:unbundle to add
support for object validation (fsck) in fetch scenarios, mainly following the
suggestions from Junio and Patrick on the mailing list.

Xing Xin (3):
  bundle-uri: verify oid before writing refs
  fetch-pack: expose fsckObjects configuration logic
  unbundle: extend object verification for fetches

 bundle-uri.c                |   6 +-
 bundle.c                    |   3 +
 bundle.h                    |   1 +
 fetch-pack.c                |  17 ++--
 fetch-pack.h                |   5 +
 t/t5558-clone-bundle-uri.sh | 181 +++++++++++++++++++++++++++++++++++-
 t/t5607-clone-bundle.sh     |  33 +++++++
 transport.c                 |   3 +-
 8 files changed, 235 insertions(+), 14 deletions(-)


base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1730%2Fblanet%2Fxx%2Fbundle-uri-bug-using-bundle-list-v7
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1730/blanet/xx/bundle-uri-bug-using-bundle-list-v7
Pull-Request: https://github.com/gitgitgadget/git/pull/1730

Range-diff vs v6:

 1:  e958a3ab20c ! 1:  fc9f44fda00 bundle-uri: verify oid before writing refs
    @@ Commit message
         be found for negotiation because it exists in "incr.pack", which is not
         included in `packed_git`.

    -    This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag
    -    when writing bundle refs. When `refs.c:refs_update_ref` is called to to
    -    write the corresponding bundle refs, it triggers
    -    `refs.c:ref_transaction_commit`. This, in turn, invokes
    -    `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of
    -    the refs storage backend. For files backend, this function is
    -    `files-backend.c:files_transaction_prepare`, and for reftable backend,
    -    it is `reftable-backend.c:reftable_be_transaction_prepare`. Both
    -    functions eventually call `object.c:parse_object`, which can invoke
    +    Fix the bug by removing `REF_SKIP_OID_VERIFICATION` flag when writing
    +    bundle refs. When `refs.c:refs_update_ref` is called to write the
    +    corresponding bundle refs, it triggers `refs.c:ref_transaction_commit`.
    +    This, in turn, invokes `refs.c:ref_transaction_prepare`, which calls
    +    `transaction_prepare` of the refs storage backend. For files backend, it
    +    is `files-backend.c:files_transaction_prepare`, and for reftable
    +    backend, it is `reftable-backend.c:reftable_be_transaction_prepare`.
    +    Both functions eventually call `object.c:parse_object`, which can invoke
         `packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
         that bundle refs point to valid objects and that all tips from bundle
         refs are correctly parsed during subsequent negotiations.

    -    A test has been added to demonstrate that bundles with incorrect
    -    headers, where refs point to non-existent objects, do not result in any
    -    bundle refs being created in the repository. Additionally, a set of
    -    negotiation-related tests for fetching with bundle-uri has been
    -    included.
    +    A set of negotiation-related tests for cloning with bundle-uri has been
    +    included to demonstrate that downloaded bundles are utilized to
    +    accelerate fetching.
    +
    +    Additionally, another test has been added to show that bundles with
    +    incorrect headers, where refs point to non-existent objects, do not
    +    result in any bundle refs being created in the repository.

         Reviewed-by: Karthik Nayak <karthik.188@xxxxxxxxx>
         Reviewed-by: Patrick Steinhardt <ps@xxxxxx>
    @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *fi
      	bundle_header_release(&header);

      ## t/t5558-clone-bundle-uri.sh ##
    +@@
    + test_description='test fetching bundles with --bundle-uri'
    +
    + . ./test-lib.sh
    ++. "$TEST_DIRECTORY"/lib-bundle.sh
    +
    + test_expect_success 'fail to clone from non-existent file' '
    + 	test_when_finished rm -rf test &&
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'fail to clone from non-bundle file' '

      test_expect_success 'create bundle' '
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'fail to clone from non-bundle
     +		git bundle create B.bundle topic &&
     +
     +		# Create a bundle with reference pointing to non-existent object.
    -+		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
    ++		sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
    ++			<A.bundle >bad-header.bundle &&
    ++		convert_bundle_to_pack \
    ++			<A.bundle >>bad-header.bundle
     +	)
      '

    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
      '

     +test_expect_success 'clone with bundle that has bad header' '
    ++	# Write bundle ref fails, but clone can still proceed.
     +	git clone --bundle-uri="clone-from/bad-header.bundle" \
     +		clone-from clone-bad-header 2>err &&
    -+	# Write bundle ref fails, but clone can still proceed.
     +	commit_b=$(git -C clone-from rev-parse B) &&
     +	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
     +	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m
      	! grep "refs/bundles/" refs
      '

    -+#########################################################################
    -+# Clone negotiation related tests begin here
    -+
     +test_expect_success 'negotiation: bundle with part of wanted commits' '
    -+	test_when_finished rm -rf trace*.txt &&
    ++	test_when_finished "rm -f trace*.txt" &&
     +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
     +	git clone --no-local --bundle-uri="clone-from/A.bundle" \
     +		clone-from nego-bundle-part &&
     +	git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
     +	grep "refs/bundles/" refs >actual &&
    -+	cat >expect <<-\EOF &&
    -+	refs/bundles/topic
    -+	EOF
    ++	test_write_lines refs/bundles/topic >expect &&
     +	test_cmp expect actual &&
     +	# Ensure that refs/bundles/topic are sent as "have".
    -+	grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
    ++	test_grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
     +'
     +
     +test_expect_success 'negotiation: bundle with all wanted commits' '
    -+	test_when_finished rm -rf trace*.txt &&
    ++	test_when_finished "rm -f trace*.txt" &&
     +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
     +	git clone --no-local --single-branch --branch=topic --no-tags \
     +		--bundle-uri="clone-from/B.bundle" \
     +		clone-from nego-bundle-all &&
     +	git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
     +	grep "refs/bundles/" refs >actual &&
    -+	cat >expect <<-\EOF &&
    -+	refs/bundles/topic
    -+	EOF
    ++	test_write_lines refs/bundles/topic >expect &&
     +	test_cmp expect actual &&
     +	# We already have all needed commits so no "want" needed.
     +	! grep "clone> want " trace-packet.txt
     +'
     +
     +test_expect_success 'negotiation: bundle list (no heuristic)' '
    -+	test_when_finished rm -f trace*.txt &&
    ++	test_when_finished "rm -f trace*.txt" &&
     +	cat >bundle-list <<-EOF &&
     +	[bundle]
     +		version = 1
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m
     +	refs/bundles/left
     +	EOF
     +	test_cmp expect actual &&
    -+	grep "clone> have $(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left)" trace-packet.txt
    ++	test_grep "clone> have $(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left)" trace-packet.txt
     +'
     +
     +test_expect_success 'negotiation: bundle list (creationToken)' '
    -+	test_when_finished rm -f trace*.txt &&
    ++	test_when_finished "rm -f trace*.txt" &&
     +	cat >bundle-list <<-EOF &&
     +	[bundle]
     +		version = 1
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m
     +	refs/bundles/left
     +	EOF
     +	test_cmp expect actual &&
    -+	grep "clone> have $(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left)" trace-packet.txt
    ++	test_grep "clone> have $(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left)" trace-packet.txt
     +'
     +
     +test_expect_success 'negotiation: bundle list with all wanted commits' '
    -+	test_when_finished rm -f trace*.txt &&
    ++	test_when_finished "rm -f trace*.txt" &&
     +	cat >bundle-list <<-EOF &&
     +	[bundle]
     +		version = 1
 2:  d21c236b8de = 2:  3dc0d9dd22f fetch-pack: expose fsckObjects configuration logic
 3:  53395e8c08a ! 3:  2f15099bbb9 unbundle: support object verification for fetches
    @@ Metadata
     Author: Xing Xin <xingxin.xx@xxxxxxxxxxxxx>

      ## Commit message ##
    -    unbundle: support object verification for fetches
    +    unbundle: extend object verification for fetches

    -    This commit extends object verification support for fetches in
    -    `bundle.c:unbundle` by adding the `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH`
    -    option to `verify_bundle_flags`. When this option is enabled,
    -    `bundle.c:unbundle` invokes `fetch-pack.c:fetch_pack_fsck_objects` to
    -    determine whether to append the "--fsck-objects" flag to
    -    "git-index-pack".
    +    The existing fetch.fsckObjects and transfer.fsckObjects configurations
    +    were not fully applied to bundle-involved fetches, including direct
    +    bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
    +    object verification support for unbundle.

    -    `VERIFY_BUNDLE_FSCK_FOLLOW_FETCH` is now passed to `unbundle` in the
    -    fetching process, including:
    +    This commit extends object verification support in `bundle.c:unbundle`
    +    by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
    +    this option is enabled, we append the `--fsck-objects` flag to
    +    `git-index-pack`.
    +
    +    The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
    +    where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
    +    to enable this option for `bundle.c:unbundle`, specifically in:

         - `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
         - `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.

         This addition ensures a consistent logic for object verification during
    -    fetch operations. Tests have been added to confirm functionality in the
    -    scenarios mentioned above.
    +    fetches. Tests have been added to confirm functionality in the scenarios
    +    mentioned above.

         Reviewed-by: Patrick Steinhardt <ps@xxxxxx>
         Signed-off-by: Xing Xin <xingxin.xx@xxxxxxxxxxxxx>

      ## bundle-uri.c ##
    +@@
    + #include "hashmap.h"
    + #include "pkt-line.h"
    + #include "config.h"
    ++#include "fetch-pack.h"
    + #include "remote.h"
    +
    + static struct {
    @@ bundle-uri.c: static int unbundle_from_file(struct repository *r, const char *file)
      	 * the prerequisite commits.
      	 */
      	if ((result = unbundle(r, &header, bundle_fd, NULL,
     -			       VERIFY_BUNDLE_QUIET)))
    -+			       VERIFY_BUNDLE_QUIET | VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)))
    ++			       VERIFY_BUNDLE_QUIET | (fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0))))
      		return 1;

      	/*

      ## bundle.c ##
    -@@
    - #include "list-objects-filter-options.h"
    - #include "connected.h"
    - #include "write-or-die.h"
    -+#include "fetch-pack.h"
    -
    - static const char v2_bundle_signature[] = "# v2 git bundle\n";
    - static const char v3_bundle_signature[] = "# v3 git bundle\n";
    @@ bundle.c: int unbundle(struct repository *r, struct bundle_header *header,
      	if (header->filter.choice)
      		strvec_push(&ip.args, "--promisor=from-bundle");

    -+	if (flags & VERIFY_BUNDLE_FSCK_FOLLOW_FETCH)
    -+		if (fetch_pack_fsck_objects())
    -+			strvec_push(&ip.args, "--fsck-objects");
    ++	if (flags & VERIFY_BUNDLE_FSCK)
    ++		strvec_push(&ip.args, "--fsck-objects");
     +
      	if (extra_index_pack_args) {
      		strvec_pushv(&ip.args, extra_index_pack_args->v);
    @@ bundle.h: int create_bundle(struct repository *r, const char *path,
      enum verify_bundle_flags {
      	VERIFY_BUNDLE_VERBOSE = (1 << 0),
      	VERIFY_BUNDLE_QUIET = (1 << 1),
    -+	VERIFY_BUNDLE_FSCK_FOLLOW_FETCH = (1 << 2),
    ++	VERIFY_BUNDLE_FSCK = (1 << 2),
      };

      int verify_bundle(struct repository *r, struct bundle_header *header,

      ## t/t5558-clone-bundle-uri.sh ##
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'create bundle' '
    - 		git bundle create B.bundle topic &&
    -
    - 		# Create a bundle with reference pointing to non-existent object.
    --		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
    -+		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle &&
    + 		sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
    + 			<A.bundle >bad-header.bundle &&
    + 		convert_bundle_to_pack \
    +-			<A.bundle >>bad-header.bundle
    ++			<A.bundle >>bad-header.bundle &&
     +
     +		cat >data <<-EOF &&
     +		tree $(git rev-parse HEAD^{tree})
    @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with bundle that has bad
     +		clone-from clone-bad-object-no-fsck &&
     +	git -C clone-bad-object-no-fsck for-each-ref --format="%(refname)" >refs &&
     +	grep "refs/bundles/" refs >actual &&
    -+	cat >expect <<-\EOF &&
    -+	refs/bundles/bad
    -+	EOF
    ++	test_write_lines refs/bundles/bad >expect &&
     +	test_cmp expect actual &&
     +
     +	# Unbundle fails with fsckObjects set true, but clone can still proceed.
    @@ transport.c: static int fetch_refs_from_bundle(struct transport *transport,
      		get_refs_from_bundle_inner(transport);
      	ret = unbundle(the_repository, &data->header, data->fd,
     -		       &extra_index_pack_args, 0);
    -+		       &extra_index_pack_args, VERIFY_BUNDLE_FSCK_FOLLOW_FETCH);
    ++		       &extra_index_pack_args,
    ++		       fetch_pack_fsck_objects() ? VERIFY_BUNDLE_FSCK : 0);
      	transport->hash_algo = data->header.hash_algo;
      	return ret;
      }

-- 
gitgitgadget
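
For readers who want the fsck decision from patch 3 in isolation, here is a
minimal standalone sketch. It is not code from this series. The flag values
mirror the bundle.h hunk, but the helper names `sketch_fsck_objects` and
`sketch_unbundle_flags`, and the use of -1 to mean "config not set", are
assumptions made purely for illustration. The precedence follows the
documented behaviour that fetch.fsckObjects falls back to
transfer.fsckObjects when unset.

```c
/* Flag values copied from the bundle.h hunk in the range-diff. */
enum verify_bundle_flags {
	VERIFY_BUNDLE_VERBOSE = (1 << 0),
	VERIFY_BUNDLE_QUIET = (1 << 1),
	VERIFY_BUNDLE_FSCK = (1 << 2),
};

/*
 * Hypothetical stand-in for fetch_pack_fsck_objects(). Each argument is
 * -1 when the corresponding config is unset (an assumption of this
 * sketch), otherwise 0 or 1. fetch.fsckObjects wins when set,
 * transfer.fsckObjects is the fallback, and the default is off.
 */
static int sketch_fsck_objects(int fetch_cfg, int transfer_cfg)
{
	if (fetch_cfg >= 0)
		return fetch_cfg;
	if (transfer_cfg >= 0)
		return transfer_cfg;
	return 0;
}

/*
 * Hypothetical helper: compute the flags a fetch-side caller would pass
 * to unbundle(), so that unbundle() itself only has to test the bit.
 */
static unsigned sketch_unbundle_flags(unsigned base, int fetch_cfg,
				      int transfer_cfg)
{
	return base | (sketch_fsck_objects(fetch_cfg, transfer_cfg) ?
		       VERIFY_BUNDLE_FSCK : 0);
}
```

With this shape, a bundle-uri style caller would pass VERIFY_BUNDLE_QUIET as
the base and a direct bundle fetch would pass 0. Keeping the config lookup in
the callers matches the v7 change, in which bundle.c no longer includes
fetch-pack.h.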