Re: [PATCH/RFC 4/6] transport: add refspec list parameters to functions

On Tue, Apr 19, 2016 at 05:40:01PM -0400, David Turner wrote:

> > I dunno, I am a bit negative on bringing new features to Git-over-HTTP
> > (which is already less efficient than the other protocols!) without any
> > plan for supporting them in the other protocols.
> 
> Interesting -- can you expand on git-over-http being less efficient?
> This is the first I'd heard of it.  Is it documented somewhere?

I don't know offhand of a thorough discussion I can link to. But
basically, the issue is the tip negotiation that happens during a fetch.
In the normal git-over-ssh and git-over-tcp protocols, we're
full-duplex, and both sides remember all of the state. So the client is
spewing "want" and "have" lines at the server, which is responding
asynchronously with acks or naks until they reach a shared point to
generate the pack from.

In the HTTP protocol, this negotiation has to happen via synchronous
request/response pairs. So the client says "here are some haves; what do
you think?" and gets back the response. Then it prepares another batch of
haves, and so on, until the server says "OK, I've seen enough; here's
the pack". But because the server is stateless, each request has to
summarize the findings of the prior request. And so each request gets
slightly bigger as we iterate.
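
To make the growth concrete, here is a toy model in Python. It is not
Git's actual code: the batch size, the number of haves the server
acknowledges per round, and the function names are all assumptions of
mine, chosen only to show the shape of the problem.

```python
def stateless_request_sizes(rounds, batch=16, acked_per_round=4):
    """Count the "have" lines in each request when the server is stateless.

    Toy model: every request carries a fresh batch of haves, plus a
    re-statement of whatever the server acknowledged in earlier rounds
    (the server has forgotten it, so the client must repeat it).
    """
    sizes = []
    carried = 0  # earlier findings the client has to send again
    for _ in range(rounds):
        sizes.append(batch + carried)
        carried += acked_per_round
    return sizes


def stateful_request_sizes(rounds, batch=16):
    """Same conversation over a full-duplex transport.

    The server remembers earlier rounds, so each message only needs
    the new batch.
    """
    return [batch] * rounds


if __name__ == "__main__":
    print(stateless_request_sizes(4))  # [16, 20, 24, 28]
    print(stateful_request_sizes(4))   # [16, 16, 16, 16]
```

The real numbers depend on how much history the two sides share, but
the stateless column can only go up.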

There are some tunable parameters there (e.g., how many haves to send in
the first batch?), and the current settings are meant to be a mix of not
wasting too much time preparing a request, but also putting enough into
it that common requests can complete with only a single round trip.

I don't have numbers on how often we have to fall back to multiple
requests, or how big they can grow. I know I have very occasionally seen
pathological cases where we outgrew the HTTP buffer sizes, and re-trying
the fetch via ssh just worked.

I'm cc-ing Shawn, who designed all of this, and can probably give more
details (and may also have opinions on new http-only protocol features,
as he'd probably end up implementing them in JGit, too).

It would be nice if we could do a true full-duplex conversation over
HTTP. I looked into Websockets at one point, but IIRC there wasn't
libcurl support for them.

> > So I'd rather see something like:
> > 
> >   1. Support for v2 "capabilities only" initial negotiation, followed
> >      by ref advertisement.
> > 
> >   2. Support for refspec-limiting capability.
> > 
> >   3. HTTP-only option from client to trigger v2 on the server.
> > 
> > That's still HTTP-specific, but it has a clear path for converging with
> > the ssh and git protocols eventually, rather than having to support
> > magic out-of-band capabilities forever.
> > 
> > It does require an extra round of HTTP request/response, though.
> 
> This seems way more complicated to me, and not necessarily
> super-efficient.  That is, it seems like rather a lot of work to add a
> whole round of negotiation and a new protocol, when all we really need is
> one little tweak.

It is less efficient because of the extra round. If the new protocol
were truly client-speaks-first, we could drop that round (which is
essentially what your proposal is doing; you're just sticking the
first-speak part into HTTP parameters).

I don't know how much that round costs if it's part of the same TCP
session, or part of the same pipelined HTTP connection.

> I wonder if it would be possible to just add these tweaks to v1, and
> save the v2 work for when someone has the time to implement it?

I don't think it's possible for the non-HTTP protocols. The single
change in v2 is to add a phase before the ref advertisement starts.
Without that, the server is going to start spewing advertisements.

You can find previous discussion on the list, but I think the options
basically are:

  1. Something like v2, where the client gets a chance to speak before
     the advertisement.

  2. Some out-of-band way of getting values from the client to the
     server (so maybe extra command-line arguments for git-over-ssh, and
     maybe shoving something after the "\0" for git-daemon, and of
     course extra parameters for HTTP).

  3. The client saying "stop spewing refs at me, I want to give you a
     ref filter" asynchronously, and accepting a little spew at the
     beginning of each conversation. That obviously only works for the
     full-duplex transports, so you'd probably fall back to (2) for
     http.
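
For the git-daemon half of option (2), a rough sketch of what "shoving
something after the \0" could look like. The pkt-line framing and the
"host=" parameter are how the initial request is sent today. The
"refspec=..." parameter and the helper names are hypothetical, made up
for illustration; no existing server understands them.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line.

    The prefix is four hex digits giving the total length, including
    the four prefix bytes themselves.
    """
    length = len(payload) + 4
    if length > 0xFFFF:
        # Four hex digits cannot express anything larger.
        raise ValueError("payload too large for a pkt-line")
    return b"%04x" % length + payload


def daemon_request(service, path, host, extra=()):
    """Build the first packet a client sends to git-daemon.

    The layout is "<service> <path>\\0host=<host>\\0". Each item in
    `extra` is appended as one more NUL-terminated field; that part is
    the hypothetical out-of-band channel, not current behavior.
    """
    fields = [f"{service} {path}".encode(), f"host={host}".encode()]
    fields.extend(item.encode() for item in extra)
    return pkt_line(b"".join(field + b"\0" for field in fields))


if __name__ == "__main__":
    request = daemon_request(
        "git-upload-pack",
        "/project.git",
        "example.com",
        extra=["refspec=refs/heads/*"],  # hypothetical parameter
    )
    print(request)
```

Whether an old server quietly ignores the trailing field or chokes on
it is something you would have to test before relying on this.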

-Peff