On Wed, 19 Dec 2018 at 22:59, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>
> While using trying to use git to clone a remote repository git
> index-pack occasionally goes on to hang:

On Thu, 20 Dec 2018 at 16:48, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>
> On Thu, 20 Dec 2018 at 15:11, Jeff King <peff@xxxxxxxx> wrote:
> >
> [snip]
> >
> > with each blocking on read() from its predecessor. So you need to find
> > out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the
> > likely cause is either:
> >
> >   1. The git-upload-pack on the remote side stopped generating data for
> >      some reason. You may or may not have access on the remotehost to
> >      dig into that.
> >
> >      It's certainly possible there's a deadlock bug between the server
> >      and client side of a Git conversation. But I'd find it extremely
> >      unlikely to find such a deadlock bug at this point in the
> >      conversation, because at this point the client side has nothing
> >      left to say to the server. The server should just be streaming out
> >      the packfile bytes and then closing the descriptor.
>
> I think it's highly unlikely too given how many good runs we generally have.
>
> >      You mentioned "Phabricator sshd scripts" running on the server.
> >      I don't know what Phabricator might be sticking in the middle of
> >      the connection, but that could be the source of the stall.
>
> I think you're right. I set up a seperate sshd on a different port on
> the same machine where there were no Phabricator callouts and the
> problem never manifested...

Just to finally follow up: this was confirmed to be a bug in
Phabricator's server-side git+ssh wrapper, and the issue exists in at
least the Phabricator versions between 2016 Week 28 and 2019 Week 3.
The Phabricator ssh-exec PHP script was doing a select() that would
very occasionally fail with errno set to EINTR, after which subsequent
git data was never sent to the git client.
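
For anyone hitting something similar, here is a minimal sketch of the
usual remedy: treat an interrupted select() as "try again" rather than
as a failure. This is not the Phabricator code (which is PHP) and not
my patch; select_retrying and its _select parameter are hypothetical
names used only for illustration. Python 3.5 and later already retry
select.select() on EINTR unless a signal handler raises, so the
explicit loop below mainly makes the pattern visible.

```python
import select


def select_retrying(rlist, wlist, xlist, timeout=None, _select=select.select):
    """Wait for descriptors to become ready, retrying if interrupted.

    _select is a hypothetical hook so the underlying call can be swapped
    out (for example in a test); it defaults to select.select.
    """
    while True:
        try:
            return _select(rlist, wlist, xlist, timeout)
        except InterruptedError:
            # EINTR: a signal arrived while we were waiting. The
            # descriptors themselves are fine, so wait again instead of
            # giving up on the data still to be forwarded.
            continue
```

Note that this sketch reuses the full timeout on every retry; code
that needs a hard deadline would have to subtract the time already
spent waiting.
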
It's unclear why this started happening more frequently for us, but we
had recently moved to a larger, faster AWS instance type with more
cores, and we had also switched to a later Ubuntu 14.04 kernel...

See
https://discourse.phabricator-community.org/t/sporadic-git-cloning-hang-over-ssh/2233
for a longer description of the issue as reported to the Phabricator
developers. That link contains a patch of mine that resolves the
issue, a link to the upstream Phabricator task created by the
Phabricator devs, and a link to an upstream Phabricator diff that works
around the issue in a looser manner than my final patch.

Thanks for your rapid reply and guidance Jeff!

-- 
Sitsofe | http://sucs.org/~sits/