Re: clone hang prevention / timeout?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 11, 2016 at 10:49:19PM +0100, Jason Vas Dias wrote:

> It appears GIT has no way of specifying a timeout for a clone operation -
> if the server decides not to complete a get request, the clone can
> hang forever -
> is this correct ?

Yes. Git's protocol has no timeouts, though each side is generally
either writing or reading at any moment, and so an interrupted
connection should cause either EPIPE or EOF, ending the process. The
exceptions I have seen are:

 - protocol / implementation bugs that cause a true deadlock. At this
   we've fixed all known cases, but that doesn't mean there aren't bugs
   lurking.

 - the network drops out in such a way that the OS doesn't realize the
   connection is gone, and the reading side is left waiting for input
   forever

I think the TCP keepalive stuff that Eric mentioned should address the
latter, though I don't know how well it works in practice. We used to
sometimes see processes hung for days on GitHub, but it's been a long
time. I don't recall if it was pre-v1.8.5 (which introduced
SO_KEEPALIVE), or if we made some other change (we have a load-balancing
layer in front that has more aggressive timeouts).

> This appears to be what I am seeing, in a script that is attempting to do many
> successive clone operations, eg. of
> git://anongit.freedesktop.org/xorg/* , the script
> occasionally hangs in a clone - I can see with netstat + strace that the TCP
> connection is open and GIT is trying to read .
> Is there any option I can specify to get the clone to timeout, or do I manually
> have to strace the git process and send it a signal after a hang is detected?

There are periods where a git client may have to wait for a while in
read() while the other side is quiet (e.g., when the other side is badly
packed and needs to do a lot of up-front CPU work to prepare the
packfile). Since v1.8.4.2, the server side of a clone should generate
application-level keepalive packets, so that the client never sees
silence for more than ~5 seconds. The freedesktop servers appear to be
on v2.1.4, so a long read() as you're seeing probably is a real hang.

Note that pushing has a similar problem (the client may wait a long time
while the server chews on the uploaded packfile before reporting
status). There are no keepalives in that direction, though I have a
series there that I need to polish and submit.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]