Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: > When fetching (as opposed to cloning) from a repository with packfile > URIs enabled, an error like this may occur: > > fatal: pack has bad object at offset 12: unknown object type 5 > fatal: finish_http_pack_request gave result -1 > fatal: fetch-pack: expected keep then TAB at start of http-fetch output > > This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, > use index-pack arg", 2021-02-22), when the index-pack args used when > processing the inline packfile of a fetch response and when processing > packfile URIs were unified. > This bug happens because fetch, by default, partially reads (and > consumes) the header of the inline packfile to determine if it should > store the downloaded objects as a packfile or loose objects, and thus > passes --pack_header=<...> to index-pack to inform it that some bytes > are missing. ... and what the values in them are. > However, when it subsequently fetches the additional > packfiles linked by URIs, it reuses the same index-pack arguments, thus > wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are > missing. > > This does not happen when cloning because "git clone" always passes > do_keep, which instructs the fetch mechanism to always retain the > packfile, eliminating the need to read the header. > > There are a few ways to fix this, including filtering out pack_header > arguments when downloading the additional packfiles, but ... Avoiding the condition that exhibits the breakage is possible, and I think it is what is done here, but I actually think that the only right fix is to pass correct argument to commands we invoke in the first place. Why are we reusing the same argument array to begin with? ... goes back and reads the offending commit ... commit b664e9ffa153189dae9b88f32d1c5fedcf85056a Author: Jonathan Tan <jonathantanmy@xxxxxxxxxx> Date: Mon Feb 22 11:20:08 2021 -0800 fetch-pack: with packfile URIs, use index-pack arg Unify the index-pack arguments used when processing the inline pack and when downloading packfiles referenced by URIs. This is done by teaching get_pack() to also store the index-pack arguments whenever at least one packfile URI is given, and then when processing the packfile URI(s), using the stored arguments. THis makes it sound like the entire idea of this offending commit was wrong, and before it, the codepath that processed the packfile fetched from the packfile URI were using the index-pack correctly by using index-pack arguments that are independent from the one that is used to process the packfile given in-stream. Why isn't the fix just a straight revert of the commit??? > This is on jt/transfer-fsck-across-packs. Ouch. This definitely is an -rc material. > - if (!args->keep_pack && unpack_limit) { > + if (!args->keep_pack && unpack_limit && !index_pack_args) { This one makes sense as an "avoid conditions that reveals how badly the code is broken" band-aid. When we have index-pack related arguments, we cannot use the unpack-objects codepath even if we are being fed a tiny pack, so there is no point peeking at the beginning of the pack stream to find out how many objects it has. OK. > @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, > strvec_push(&cmd.args, "-v"); > if (args->use_thin_pack) > strvec_push(&cmd.args, "--fix-thin"); > - if (do_keep && (args->lock_pack || unpack_limit)) { > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { > char hostname[HOST_NAME_MAX + 1]; > if (xgethostname(hostname, sizeof(hostname))) > xsnprintf(hostname, sizeof(hostname), "localhost"); I do not quite get what this hunk is doing. Care to explain? Thanks. > diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh > index b1bc73a9a9..9df1ec82ca 100755 > --- a/t/t5702-protocol-v2.sh > +++ b/t/t5702-protocol-v2.sh > @@ -853,6 +853,27 @@ test_expect_success 'part of packfile response provided as URI' ' > test_line_count = 6 filelist > ' > > +test_expect_success 'packfile URIs with fetch instead of clone' ' > + P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > + rm -rf "$P" http_child log && > + > + git init "$P" && > + git -C "$P" config "uploadpack.allowsidebandall" "true" && > + > + echo my-blob >"$P/my-blob" && > + git -C "$P" add my-blob && > + git -C "$P" commit -m x && > + > + configure_exclusion "$P" my-blob >h && > + > + git init http_child && > + > + GIT_TEST_SIDEBAND_ALL=1 \ > + git -C http_child -c protocol.version=2 \ > + -c fetch.uriprotocols=http,https \ > + fetch "$HTTPD_URL/smart/http_parent" > +' > + > test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' > P="$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && > rm -rf "$P" http_child log &&