Re: [PATCH] fetch-pack: do not mix --pack_header and packfile uri

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:
>
>>> I dunno how involved the necessary surgery would be, though.  If
>>> this is easy to work around, perhaps it might be a better option for
>>> the overall project to ship the upcoming release with this listed as
>>> a known breakage.
>>
>> I don't think it's too difficult - I think we'll only need to filter out
>> the --pack_header when we figure out the arguments to pass for the
>> packfiles given by URI. I'll take a look.
>
> What you sent earlier is a much better band-aid than "keep the
> single args array but filter an element out in only one codepath"
> band-aid, I would think.
>
> Any change that is more involved than a single-liner trivial bugfix
> would be too late for this cycle, as we'd be cutting -rc2 by the end
> of tomorrow.

I was looking at the index_pack_args vs pass_header codepath in
fetch-pack.c again after finishing the -rc2 stuff, and noticed
something curious.

Before running the command to process in-stream packdata, we have
this bit:

	if (index_pack_args) {
		int i;

		for (i = 0; i < cmd.args.nr; i++)
			strvec_push(index_pack_args, cmd.args.v[i]);
	}

where cmd.args is what the original code (before the "we need to
prepare the index pack arguments for the offline HTTP transfer"
logic was bolted onto this codepath), so it could of course have
things like "--fix-thin", "--promisor", when we are processing an
in-stream packfile that has sufficiently large number of objects and
choose "index-pack" to process it.  None of them should be given to
the "index-pack" that processes the offline packfile that is given
via the packfile URI mechanism.

Also, because this loop copies everything in cmd.args, if our
in-stream packdata is small, cmd.args.v[0] would be "unpack-objects",
and we end up asking the command to explode the (presumably large
enough to be worth pre-generating and serving via CDN) packfile that
is given via the packfile URI mechanism.

What I think I am seeing in the code is that there are many things
other than "pass_header" that fundamentally cannot be reused between
the processing of the in-stream packdata and the offline packfile
given by the packfile URI (e.g. the in-stream one may want to use
"unpack-objects" to avoid accumulating too many tiny packs, so there
is nothing to be shared with "index-pack" that will always be used
for the offline one), and any attempt to "reuse" cmd.args while
"filtering out" inappropriate bits is fragile and unfruitful.

Instead, I think we should not touch index_pack in the earlier part
of the function at all (both reading, writing, or even checking for
NULL-ness), and use the "if (index_pack_args)" block we already have
(i.e. the one before we call start_command() to process the
in-stream packdata) to decide what the command line to process the
offline pack should look like.  That way, we won't ever risk such a
confusion like running "unpack-objects" instead of "index-pack" (but
we can choose to do so deliberately, of course---the important point
is to recognise that the in-stream pack and the offline one are
independant and we should decide how to cook them separately).




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux