While attempting to fix a reference negotiation bug in bundle-uri, we identified that the fetch process lacks some crucial object validation checks when processing bundles. The primary issues are: 1. In the bundle-uri scenario, object IDs were not validated before writing bundle references. This was the root cause of the original negotiation bug in bundle-uri and could lead to potential repository corruption. 2. The existing "fetch.fsckObjects" and "transfer.fsckObjects" configurations were not applied when directly fetching bundles or fetching with bundle-uri enabled. In fact, there were no object validation supports for unbundle. The first patch addresses the bundle-uri negotiation issue by removing the REF_SKIP_OID_VERIFICATION flag when writing bundle references. Patches 2 through 3 extend verify_bundle_flags for bundle.c:unbundle to add support for object validation (fsck) in fetch scenarios, mainly following the suggestions from Junio and Patrick on the mailing list. Xing Xin (3): bundle-uri: verify oid before writing refs fetch-pack: expose fsckObjects configuration logic unbundle: extend object verification for fetches bundle-uri.c | 6 +- bundle.c | 3 + bundle.h | 1 + fetch-pack.c | 17 ++-- fetch-pack.h | 5 + t/t5558-clone-bundle-uri.sh | 187 +++++++++++++++++++++++++++++++++++- t/t5607-clone-bundle.sh | 35 +++++++ transport.c | 3 +- 8 files changed, 243 insertions(+), 14 deletions(-) base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1730%2Fblanet%2Fxx%2Fbundle-uri-bug-using-bundle-list-v8 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1730/blanet/xx/bundle-uri-bug-using-bundle-list-v8 Pull-Request: https://github.com/gitgitgadget/git/pull/1730 Range-diff vs v7: 1: fc9f44fda00 ! 1: d8fbde2dcd4 bundle-uri: verify oid before writing refs @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'fail to clone from non-bundle + git bundle create B.bundle topic && + + # Create a bundle with reference pointing to non-existent object. -+ sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \ ++ commit_a=$(git rev-parse A) && ++ commit_b=$(git rev-parse B) && ++ sed -e "/^$/q" -e "s/$commit_a /$commit_b /" \ + <A.bundle >bad-header.bundle && + convert_bundle_to_pack \ + <A.bundle >>bad-header.bundle @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' ' + commit_b=$(git -C clone-from rev-parse B) && + test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err && + git -C clone-bad-header for-each-ref --format="%(refname)" >refs && -+ ! grep "refs/bundles/" refs ++ test_grep ! "refs/bundles/" refs +' + test_expect_success 'clone with path bundle and non-default hash' ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m + test_write_lines refs/bundles/topic >expect && + test_cmp expect actual && + # Ensure that refs/bundles/topic are sent as "have". -+ test_grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt ++ tip=$(git -C clone-from rev-parse A) && ++ test_grep "clone> have $tip" trace-packet.txt +' + +test_expect_success 'negotiation: bundle with all wanted commits' ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m + test_write_lines refs/bundles/topic >expect && + test_cmp expect actual && + # We already have all needed commits so no "want" needed. -+ ! grep "clone> want " trace-packet.txt ++ test_grep ! "clone> want " trace-packet.txt +' + +test_expect_success 'negotiation: bundle list (no heuristic)' ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m + refs/bundles/left + EOF + test_cmp expect actual && -+ test_grep "clone> have $(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left)" trace-packet.txt ++ tip=$(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left) && ++ test_grep "clone> have $tip" trace-packet.txt +' + +test_expect_success 'negotiation: bundle list (creationToken)' ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m + refs/bundles/left + EOF + test_cmp expect actual && -+ test_grep "clone> have $(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left)" trace-packet.txt ++ tip=$(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left) && ++ test_grep "clone> have $tip" trace-packet.txt +' + +test_expect_success 'negotiation: bundle list with all wanted commits' ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone bundle list (file, any m + EOF + test_cmp expect actual && + # We already have all needed commits so no "want" needed. -+ ! grep "clone> want " trace-packet.txt ++ test_grep ! "clone> want " trace-packet.txt +' + ######################################################################### 2: 3dc0d9dd22f ! 2: 518584c8698 fetch-pack: expose fsckObjects configuration logic @@ Commit message "fetch.fsckObjects" to control checks for broken objects in received packs during fetches. However, these configurations were only acknowledged by `fetch-pack.c:get_pack` and did not take effect in - direct bundle fetches and fetches with _bundle-uri_ enabled. + direct bundle fetches or fetches with _bundle-uri_ enabled. This commit exposes the fetch-then-transfer configuration logic by adding a new function `fetch_pack_fsck_objects` in fetch-pack.h. This new function is used to replace the assignment for `fsck_objects` in - `fetch-pack.c:get_pack`. In the next commit, it will also be used by - `bundle.c:unbundle` to better fit fetching scenarios. + `fetch-pack.c:get_pack`. In the next commit, this function will also be + used to extend fsck support for bundle-involved fetches. Helped-by: Junio C Hamano <gitster@xxxxxxxxx> Helped-by: Patrick Steinhardt <ps@xxxxxx> 3: 2f15099bbb9 ! 3: 698dd6e49b7 unbundle: extend object verification for fetches @@ bundle.h: int create_bundle(struct repository *r, const char *path, ## t/t5558-clone-bundle-uri.sh ## @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'create bundle' ' - sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \ + sed -e "/^$/q" -e "s/$commit_a /$commit_b /" \ <A.bundle >bad-header.bundle && convert_bundle_to_pack \ - <A.bundle >>bad-header.bundle + <A.bundle >>bad-header.bundle && + ++ tree_b=$(git rev-parse B^{tree}) && + cat >data <<-EOF && -+ tree $(git rev-parse HEAD^{tree}) -+ parent $(git rev-parse HEAD) ++ tree $tree_b ++ parent $commit_b + author A U Thor + committer A U Thor + + commit: this is a commit with bad emails + + EOF -+ git hash-object --literally -t commit -w --stdin <data >commit && -+ git branch bad $(cat commit) && ++ bad_commit=$(git hash-object --literally -t commit -w --stdin <data) && ++ git branch bad $bad_commit && + git bundle create bad-object.bundle bad && + git update-ref -d refs/heads/bad ) ' @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with bundle that has bad header' ' - ! grep "refs/bundles/" refs + test_grep ! "refs/bundles/" refs ' +test_expect_success 'clone with bundle that has bad object' ' -+ # Unbundle succeeds if no fsckObjects confugured. ++ # Unbundle succeeds if no fsckObjects configured. + git clone --bundle-uri="clone-from/bad-object.bundle" \ + clone-from clone-bad-object-no-fsck && + git -C clone-bad-object-no-fsck for-each-ref --format="%(refname)" >refs && @@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with bundle that has bad + clone-from clone-bad-object-fsck 2>err && + test_grep "missingEmail" err && + git -C clone-bad-object-fsck for-each-ref --format="%(refname)" >refs && -+ ! grep "refs/bundles/" refs ++ test_grep ! "refs/bundles/" refs +' + test_expect_success 'clone with path bundle and non-default hash' ' @@ t/t5607-clone-bundle.sh: test_expect_success 'fetch SHA-1 from bundle' ' + test_create_repo bundle-fsck && + ( + cd bundle-fsck && -+ test_commit first && ++ test_commit A && ++ commit_a=$(git rev-parse A) && ++ tree_a=$(git rev-parse A^{tree}) && + cat >data <<-EOF && -+ tree $(git rev-parse HEAD^{tree}) -+ parent $(git rev-parse HEAD) ++ tree $tree_a ++ parent $commit_a + author A U Thor + committer A U Thor + + commit: this is a commit with bad emails + + EOF -+ git hash-object --literally -t commit -w --stdin <data >commit && -+ git branch bad $(cat commit) && ++ bad_commit=$(git hash-object --literally -t commit -w --stdin <data) && ++ git branch bad $bad_commit && + git bundle create bad.bundle bad + ) && + -- gitgitgadget