On Thu, Dec 20, 2018 at 10:03:45AM +0000, Sitsofe Wheeler wrote: > > Check the backtrace of git-clone to see why it isn't feeding more data > > (but note that it will generally have two threads -- one processing the > > data from the remote, and one wait()ing for index-pack to finish). > > > > My guess, though, is that you'll find that git-clone is simply waiting > > on another pipe: the one from ssh. > > Ok here are backtraces from another run (with SSH multiplexing disabled): OK, that's about what I expected. Here we have clone's sideband-demux thread waiting to pull more packfile data from the remote: > (gdb) thread apply all bt > > Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)): > #0 0x00007faafc805384 in __libc_read (fd=fd@entry=5, > buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > [...coming from packet_read / recv_sideband / sideband_demux...] I assume fd=5 there is a pipe connected to ssh. You could double check with "lsof" or similar, but I don't think it would ever be reading from anywhere else. And then this thread: > Thread 1 (Thread 0x7faafce3bb80 (LWP 36584)): > #0 0x00007faafc805384 in __libc_read (fd=fd@entry=7, > buf=buf@entry=0x7ffda489ef10, nbytes=nbytes@entry=46) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x000055c8ca2f5b23 in read (__nbytes=46, __buf=0x7ffda489ef10, __fd=7) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > #2 xread (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, len=len@entry=46) > at wrapper.c:260 > #3 0x000055c8ca2f5cdb in read_in_full (fd=7, buf=buf@entry=0x7ffda489ef10, > count=count@entry=46) at wrapper.c:318 > #4 0x000055c8ca26a54f in index_pack_lockfile (ip_out=<optimised out>) > at pack-write.c:297 ...is us waiting on index-pack to single completion by printing out the name of the pack's .keep file. And then this is just index-pack: > (gdb) thread apply all bt > > Thread 1 (Thread 0x7ff8a03dab80 (LWP 36587)): > #0 0x00007ff89fda434e in __libc_read (fd=fd@entry=0, > buf=buf@entry=0x5604bea43d40 <input_buffer>, nbytes=nbytes@entry=4096) > at ../sysdeps/unix/sysv/linux/read.c:27 > #1 0x00005604be77bb23 in read (__nbytes=4096, > __buf=0x5604bea43d40 <input_buffer>, __fd=0) > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44 > #2 xread (fd=0, buf=0x5604bea43d40 <input_buffer>, len=4096) at wrapper.c:260 > #3 0x00005604be5fb069 in fill (min=min@entry=1) at builtin/index-pack.c:255 ...waiting for more data. So though it goes back and forth between two processes, the sequence of blocking data is linear due to the threads: ssh -> git-clone -> index-pack -> git-clone (sideband_demux) (index_pack_lockfile) with each blocking on read() from its predecessor. So you need to find out why "ssh" is blocking. Unfortunately, short of a bug in ssh, the likely cause is either: 1. The git-upload-pack on the remote side stopped generating data for some reason. You may or may not have access on the remotehost to dig into that. It's certainly possible there's a deadlock bug between the server and client side of a Git conversation. But I'd find it extremely unlikely to find such a deadlock bug at this point in the conversation, because at this point the client side has nothing left to say to the server. The server should just be streaming out the packfile bytes and then closing the descriptor. You mentioned "Phabricator sshd scripts" running on the server. I don't know what Phabricator might be sticking in the middle of the connection, but that could be the source of the stall. 2. It's possible the network connection dropped but ssh did not notice. Maybe try turning on ssh keepalives and seeing if it eventually times out? -Peff