Re: [RFC PATCH] upload-pack: fix filter options scope

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 7, 2020 at 1:36 PM Jeff King <peff@xxxxxxxx> wrote:
>
> On Thu, May 07, 2020 at 11:58:29AM +0200, Christian Couder wrote:
>
> > I am not sure why upload_pack_v2() is called twice. Also it seems
> > that another similar issue might not be fixed by this. So this patch
> > is RFC for now.
>
> It's because of the request/response model of v2. The client side of the
> conversation looks something like:
>
>   - client connects transport to server
>
>   - client issues command=ls-refs to get the ref advertisement
>
>   - client issues command=fetch and sends wants/have; the server may
>     respond with a pack when the negotiation is finished, but may also
>     respond with acks, etc, asking for more rounds of negotiation
>
>   - if there's no pack, the client then issues another command=fetch,
>     repeating until we get a pack
>
> In stateless git-over-http, each of those request/response pairs happens in
> its own server-side process, because the webserver kicks off a separate
> CGI for each request. But v2 adapts git://, ssh://, and direct-pipe
> connections to use the same request/response model, but all connected to
> a single Git process on the other side.
>
> So each upload_pack_v2() call needs to start with a clean slate; it
> can't assume anything about previous rounds, and it needs to clean up
> any allocated data from those rounds.
>
> So your patch is going in the right direction, but we already have other
> similar data handled by upload_pack_data, which has its own init/clear
> functions. Probably the filter data should go in there, too.

Thank you for the detailed explanations!

It looks like 'struct upload_pack_data' is indeed used by
upload_pack_v2(). It's not used by upload_pack() though. So if I move
'struct list_objects_filter_options filter_options' there, I would
need to also make upload_pack() either have its own filter_options or
use 'struct upload_pack_data' too.

> This is an easy bug to introduce, and I suspect we'll see equivalent
> ones in the future (if there aren't already more lurking; there's a lot
> of global data in Git, and I really have no idea what subtle
> interactions one could see). One thing we could do is to truly run each
> v2 command request in its own process, just like git-over-http does. But
> there are a lot of complications there around holding the pipe/socket
> open, how the parent v2 serving process interacts with the child
> requests (e.g., noticing errors), etc. So I'd worry that going that
> direction would be just as likely to introduce bugs as fix them.

I agree that it seems better to just get rid of the global data used
by the upload pack functions.

> > diff --git a/upload-pack.c b/upload-pack.c
> > index 902d0ad5e1..4e22e46294 100644
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -68,7 +68,6 @@ static const char *pack_objects_hook;
> >  static int filter_capability_requested;
> >  static int allow_filter;
> >  static int allow_ref_in_want;
> > -static struct list_objects_filter_options filter_options;
>
> Just looking at nearby bits of global data that might want similar
> treatment:
>
>  - allow_filter and allow_ref_in_want are set by the server-side config.
>    So they're persisted between upload_pack_v2() calls, but we'd
>    overwrite them by calling git_config() in each incarnation
>
>  - filter_capability_requested is set based on client input, but it's
>    only used in v1
>
> But boy there are a lot of global variables there to trace through.

Yeah, unfortunately.

> Probably a more fruitful way to find similar problems is to look at the
> v2 process_args() to see what it touches. It looks like options like
> ofs_delta probably ought to be part of upload_pack_data, too. We didn't
> notice because it would require a client doing something like:
>
>   command=fetch
>   ofs-delta    [claims to support ofs-delta]
>   0001
>   [wants and haves that don't result in a pack yet...]
>   0000
>   command=fetch
>   [do not claim to support ofs-delta!]
>   0001
>   [wants and haves that result in a pack]
>   0000
>
> The server _shouldn't_ use ofs-delta there (and wouldn't over http,
> where the stateless server side that generates the pack knows only about
> that second request). But in git-over-ssh, it would.
>
> A client who does this is stupid and wrong (and I didn't check the
> protocol spec, but it could very well be violating the spec). So I don't
> think it's _that_ big a deal. But it would be nice if all of those
> globals got moved into upload_pack_data, and both v1 and v2 just used it
> consistently.

Unfortunately as I discuss a bit above 'struct upload_pack_data' is
not used by v1, only by v2. I think making v1 use upload_pack_data
might be nice, but it's a separate issue. So I am tempted to just
improve the commit message, adding some information from you and from
this discussion, and then re-sending.

Another intermediate solution would be to add filter_options to
'struct upload_pack_data' for v2 and to use filter_options directly
for v1.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux