> Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: > > > When fetching (as opposed to cloning) from a repository with packfile > > URIs enabled, an error like this may occur: > > > > fatal: pack has bad object at offset 12: unknown object type 5 > > fatal: finish_http_pack_request gave result -1 > > fatal: fetch-pack: expected keep then TAB at start of http-fetch output > > > > This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs, > > use index-pack arg", 2021-02-22), when the index-pack args used when > > processing the inline packfile of a fetch response and when processing > > packfile URIs were unified. > > > This bug happens because fetch, by default, partially reads (and > > consumes) the header of the inline packfile to determine if it should > > store the downloaded objects as a packfile or loose objects, and thus > > passes --pack_header=<...> to index-pack to inform it that some bytes > > are missing. > > ... and what the values in them are. Ah, that's true. > > However, when it subsequently fetches the additional > > packfiles linked by URIs, it reuses the same index-pack arguments, thus > > wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are > > missing. > > > > This does not happen when cloning because "git clone" always passes > > do_keep, which instructs the fetch mechanism to always retain the > > packfile, eliminating the need to read the header. > > > > There are a few ways to fix this, including filtering out pack_header > > arguments when downloading the additional packfiles, but ... > > Avoiding the condition that exhibits the breakage is possible, and I > think it is what is done here, but I actually think that the only > right fix is to pass correct argument to commands we invoke in the > first place. Why are we reusing the same argument array to begin > with? > > ... goes back and reads the offending commit ... > > commit b664e9ffa153189dae9b88f32d1c5fedcf85056a > Author: Jonathan Tan <jonathantanmy@xxxxxxxxxx> > Date: Mon Feb 22 11:20:08 2021 -0800 > > fetch-pack: with packfile URIs, use index-pack arg > > Unify the index-pack arguments used when processing the inline pack and > when downloading packfiles referenced by URIs. This is done by teaching > get_pack() to also store the index-pack arguments whenever at least one > packfile URI is given, and then when processing the packfile URI(s), > using the stored arguments. > > THis makes it sound like the entire idea of this offending commit > was wrong, and before it, the codepath that processed the packfile > fetched from the packfile URI were using the index-pack correctly > by using index-pack arguments that are independent from the one that > is used to process the packfile given in-stream. Why isn't the fix > just a straight revert of the commit??? I should probably have written more in the commit message to justify the unification, but it is also part of a bug fix (in particular, --fsck-objects wasn't being passed to the index-pack that indexed the packfiles linked by URI) and for code health purposes (to prevent future bugs by eliminating the divergence). So reverting that commit would reintroduce another bug. > > @@ -885,7 +885,7 @@ static int get_pack(struct fetch_pack_args *args, > > strvec_push(&cmd.args, "-v"); > > if (args->use_thin_pack) > > strvec_push(&cmd.args, "--fix-thin"); > > - if (do_keep && (args->lock_pack || unpack_limit)) { > > + if ((do_keep || index_pack_args) && (args->lock_pack || unpack_limit)) { > > char hostname[HOST_NAME_MAX + 1]; > > if (xgethostname(hostname, sizeof(hostname))) > > xsnprintf(hostname, sizeof(hostname), "localhost"); > > I do not quite get what this hunk is doing. Care to explain? The "do_keep" part was unnecessarily restrictive and I used a band-aid solution to loosen it. I think this started from 88e2f9ed8e ("introduce fetch-object: fetch one promisor object", 2017-12-05) where I might have misunderstood what do_keep was meant to do, and taught fetch-pack to use "index-pack" if do_keep is true or args->from_promisor is true. What I should have done is to set do_keep to true if args->from_promisor is true. Future commits continued to do that with fsck_objects and index_pack_args. Maybe what I can do is to refactor get_pack() so that do_keep retains its original meaning of whether to use "index-pack" or "unpack-objects", and then we wouldn't need this line. What do you think (code-wise and whether this fits in with the release schedule, if we want to get this in before release)?