When fetching (as opposed to cloning) from a repository with packfile URIs enabled, an error like this may occur: fatal: pack has bad object at offset 12: unknown object type 5 fatal: finish_http_pack_request gave result -1 fatal: fetch-pack: expected keep then TAB at start of http-fetch output This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, use index-pack arg", 2021-02-22), when the index-pack args used when processing the inline packfile of a fetch response and when processing packfile URIs were unified. This bug happens because fetch, by default, partially reads (and consumes) the header of the inline packfile to determine if it should store the downloaded objects as a packfile or loose objects, and thus passes --pack_header=<...> to index-pack to inform it that some bytes are missing. However, when it subsequently fetches the additional packfiles linked by URIs, it reuses the same index-pack arguments, thus wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are missing. This does not happen when cloning because "git clone" always passes do_keep, which instructs the fetch mechanism to always retain the packfile, eliminating the need to read the header. There are a few ways to fix this, including filtering out pack_header arguments when downloading the additional packfiles, but I decided to stick to always using index-pack throughout when packfile URIs are present - thus, Git no longer needs to read the bytes, and no longer needs --pack_header here. Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx> --- Here's a fix for this issue. This is on jt/transfer-fsck-across-packs. One simplification that we could do is to eliminate the unpack-objects codepath. As far as I understand, the main advantage of writing loose objects is that we have automatic SHA-1 collision detection, but we have such mitigations when writing packs too, so that might not be as large a benefit as we think. This simplification would have enabled us to avoid this bug, I think. --- fetch-pack.c | 4 ++-- t/t5702-protocol-v2.sh | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fetch-pack.c b/fetch-pack.c index f9def5ac74..e990607742 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -852,7 +852,7 @@ static int get_pack(struct fetch_pack_args *args, else demux.out = xd[0]; - if (!args->keep_pack && unpack_limit) { + if (!args->keep_pack && unpack_limit && !index_pack_args) { if (read_pack_header(demux.out, &header)) die(_("protocol error: bad pack header")); @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, strvec_push(&cmd.args, "-v"); if (args->use_thin_pack) strvec_push(&cmd.args, "--fix-thin"); - if (do_keep && (args->lock_pack || unpack_limit)) { + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { char hostname[HOST_NAME_MAX + 1]; if (xgethostname(hostname, sizeof(hostname))) xsnprintf(hostname, sizeof(hostname), "localhost"); diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index b1bc73a9a9..9df1ec82ca 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -853,6 +853,27 @@ test_expect_success 'part of packfile response provided as URI' ' test_line_count = 6 filelist ' +test_expect_success 'packfile URIs with fetch instead of clone' ' + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && + rm -rf "$P" http_child log && + + git init "$P" && + git -C "$P" config "uploadpack.allowsidebandall" "true" && + + echo my-blob >"$P/my-blob" && + git -C "$P" add my-blob && + git -C "$P" commit -m x && + + configure_exclusion "$P" my-blob >h && + + git init http_child && + + GIT_TEST_SIDEBAND_ALL=1 \ + git -C http_child -c protocol.version=2 \ + -c fetch.uriprotocols=http,https \ + fetch "$HTTPD_URL/smart/http_parent" +' + test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && rm -rf "$P" http_child log && -- 2.30.1.766.gb4fecdf3b7-goog