Re: upload-pack is slow with lots of refs

On Mon, Oct 8, 2012 at 8:05 AM, Johannes Sixt <j6t@xxxxxxxx> wrote:
> Am 05.10.2012 18:57, schrieb Shawn Pearce:
>> On Thu, Oct 4, 2012 at 11:24 PM, Johannes Sixt <j.sixt@xxxxxxxxxxxxx> wrote:
>>> Upload-pack can just start
>>> advertising refs in the "v1" way and announce a "v2" capability and listen
>>> for response in parallel. A v2 capable client can start sending "wants" or
>>> some other signal as soon as it sees the "v2" capability. Upload-pack,
>>> which was listening for responses in parallel, can interrupt its
>>> advertisements and continue with v2 protocol from here.
>>>
>>> This sounds so simple (not the implementation, of course) - I must be
>>> missing something.
>>
>> Smart HTTP is not bidirectional. The client can't cut off the server.
>
> Smart HTTP does not need it: you already posted a better solution (I'm
> referring to "&v=2").

Yes, but then it diverges even further from the native bidirectional protocol.

>> It's also more complex to code the server to listen for a stop command
>> from the client at the same time the server is blasting out useless
>> references to the client.
>
> At least the server side does not seem to be that complex. See below.
> Of course, the server blasted out some refs, but I'm confident that in
> practice the client will be able to signal v2 capability after a few packets
> of advertisements. You can switch on TCP_NODELAY for the first line with
> the capabilities to ensure it goes out on the wire ASAP.
...
> +static int client_spoke(void)
> +{
> +       struct pollfd pfd;
> +       pfd.fd = 0;
> +       pfd.events = POLLIN;
> +       return poll(&pfd, 1, 0) > 0 &&
> +               (pfd.revents & (POLLIN|POLLHUP));

Except doing this in Java is harder on an arbitrary InputStream type.
I guess we really only care about basic TCP, in which case we can use
NIO to implement an emulation of poll, and SSH, where MINA SSHD
probably doesn't provide a way to see if the client has given us data
without blocking. That makes supporting v2 really hard in e.g. Gerrit
Code Review. You could argue that it's improper to attempt to implement
a network protocol in a language whose standard libraries have gone
out of their way to prevent you from polling to see if data is
immediately available, but I prefer to ignore such arguments.
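
For comparison, here is a minimal standalone sketch of what the C side of the quoted idea might look like: a zero-timeout poll() on a descriptor, plus a loop that stops advertising once the client has spoken. The names fd_has_input and advertise_refs are hypothetical and not taken from the patch, and the ref lines are written as plain strings rather than real pkt-lines.

```c
#include <poll.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Return 1 if fd has data (or a hangup) pending right now, 0 otherwise.
 * The poll() timeout is 0, so this never blocks.
 */
static int fd_has_input(int fd)
{
	struct pollfd pfd;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	return poll(&pfd, 1, 0) > 0 &&
		(pfd.revents & (POLLIN | POLLHUP)) != 0;
}

/*
 * Hypothetical advertisement loop: write one line per ref to out_fd, but
 * check in_fd before each write and stop as soon as the client has sent
 * anything. Returns the number of lines actually written.
 */
static size_t advertise_refs(int in_fd, int out_fd,
			     const char *const *lines, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (fd_has_input(in_fd))
			break;	/* client spoke: switch protocols here */
		if (write(out_fd, lines[i], strlen(lines[i])) < 0)
			break;	/* write error: give up on the advertisement */
	}
	return i;
}
```

A server would call advertise_refs(0, 1, lines, n) and, if the return value is less than n, go read what the client sent. The hard part described above is that a plain Java InputStream has no portable equivalent of fd_has_input.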

As it turns out we don't really have this problem with git://. Clients
can bury a v2 request in the extended headers where the host line
appears today. It's a bit tricky because of that \0 bug causing
infinite looping, but IIRC using \0\0 is safe even against ancient
servers. So git:// and http:// both have a way where the client can
ask for v2 support before the server speaks, and have it transparently
be ignored by ancient servers.
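
To make the git:// case concrete, here is a rough sketch of how a client might build such a request. It assumes the usual request layout (a pkt-line holding "git-upload-pack PATH\0host=HOST\0") and appends the extra field after an additional \0, which gives the \0\0 sequence mentioned above. The function name build_git_request and the "v=2" field are hypothetical; nothing here defines the actual wire format of a v2 request.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a git:// request pkt-line into buf:
 *
 *   LLLL git-upload-pack PATH \0 host=HOST \0 [ \0 EXTRA \0 ]
 *
 * LLLL is the total length (including itself) as four hex digits.
 * EXTRA may be NULL, which yields the classic request.
 * Returns the number of bytes written, or -1 if it does not fit.
 */
static int build_git_request(char *buf, size_t cap, const char *path,
			     const char *host, const char *extra)
{
	size_t cmd_len = strlen("git-upload-pack ") + strlen(path) + 1;
	size_t host_len = strlen("host=") + strlen(host) + 1;
	/* one empty field (a lone NUL), then EXTRA and its NUL */
	size_t extra_len = extra ? 1 + strlen(extra) + 1 : 0;
	size_t total = 4 + cmd_len + host_len + extra_len;
	char *p = buf;

	/* four hex digits cannot express more than 0xffff */
	if (total > cap || total > 0xffff)
		return -1;

	snprintf(p, 5, "%04zx", total);	/* trailing NUL is overwritten below */
	p += 4;
	p += sprintf(p, "git-upload-pack %s", path) + 1;
	p += sprintf(p, "host=%s", host) + 1;
	if (extra) {
		*p++ = '\0';	/* second NUL in a row: the \0\0 marker */
		p += sprintf(p, "%s", extra) + 1;
	}
	return (int)(p - buf);
}
```

For example, build_git_request(buf, sizeof(buf), "/repo.git", "example.com", "v=2") produces a 52-byte request whose length header is "0034". A server that stops parsing at the end of the host field never looks at the trailing bytes.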


The only place we have a problem is SSH. That exec of the remote
binary is just super-strict. It's good to be paranoid, but it's also
locked out any chance we have at doing the upgrade over SSH without
having to run two SSH commands in the worst case. I guess the best
approach is to try the v1 protocol by default, have the remote
advertise it supports v2, and remember this on a per-host basis in
~/.gitconfig for future requests. Users could always force a specific
preference with the remote.NAME.uploadpack variable or the --upload-pack
command line flag.