Hi there,

While investigating a hung job in our CI system today, I think I found a deadlock in git-remote-http.

Git version: 2.9.3
Linux (amd64) kernel 4.9.0

Excerpt from the process list:

jenkins 27316 0.0 0.0  18508  6024 ? S 19:30 0:00 | \_ git -C ../../../arista fetch --unshallow
jenkins 27317 0.0 0.0 169608 10916 ? S 19:30 0:00 |     \_ git-remote-http origin http://gerrit/arista
jenkins 27319 0.0 0.0  24160  8260 ? S 19:30 0:00 |         \_ git fetch-pack --stateless-rpc --stdin --lock-pack --include-tag --thin --no-progress --depth=2147483647 http://gerrit/arista/

Here PID 27319 (git fetch-pack) is stuck reading on stdin, while its parent, PID 27317 (git-remote-http) is stuck reading on its child’s stdout. Nothing has moved for like 2h, it’s deadlocked.

> strace -fp 27319
strace: Process 27319 attached
read(0,

Here FD 0 is a pipe:

~ @8a33a534e2f7> lsof -np 27319 | grep 0r
git 27319 jenkins 0r FIFO 0,10 0t0 354519158 pipe

The writing end of which is owned by the parent process:

~ @8a33a534e2f7> lsof -n 2>/dev/null | fgrep 354519158
git-remot 27317 jenkins 4w FIFO 0,10 0t0 354519158 pipe
git       27319 jenkins 0r FIFO 0,10 0t0 354519158 pipe

And the parent process (git-remote-http) is stuck reading from another FD:

> strace -fp 27317
strace: Process 27317 attached
read(5,

And here FD 5 is another pipe:

~ @8a33a534e2f7> lsof -np 27317 | grep 5r
git-remot 27317 jenkins 5r FIFO 0,10 0t0 354519159 pipe

Which is the child’s stdout:

> lsof -n 2>/dev/null | fgrep 354519159
git-remot 27317 jenkins 5r FIFO 0,10 0t0 354519159 pipe
git       27319 jenkins 1w FIFO 0,10 0t0 354519159 pipe

Hence the deadlock.
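
For anyone who wants to see the shape of the hang without git, here is a minimal sketch of the same situation in Python. It is not git's code: the child is a toy stand-in for git fetch-pack that reads its stdin before writing anything, and the parent plays git-remote-http by waiting on the child's stdout without having written to its stdin. The function name both_sides_blocked and the timeout are made up for this illustration, and select() is only there so the demo returns instead of hanging for 2h.

```python
import select
import subprocess
import sys


def both_sides_blocked(timeout=1.0):
    """Return True if neither side made progress within `timeout` seconds."""
    # Toy stand-in for "git fetch-pack": it waits for a line on stdin
    # before it writes anything to stdout.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdin.readline(); print('ACK', flush=True)"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        # Toy stand-in for "git-remote-http": it waits for the child's
        # output, but never wrote the request the child is waiting for.
        readable, _, _ = select.select([child.stdout], [], [], timeout)
        # No output and the child is still alive: each side is waiting
        # on a pipe that only the other side can write to.
        return not readable and child.poll() is None
    finally:
        # Clean up so the demo does not leave a stuck process behind.
        child.kill()
        child.wait()
        child.stdin.close()
        child.stdout.close()


if __name__ == "__main__":
    print("stuck:", both_sides_blocked())
```

With a plain blocking read() in place of select(), as in the strace output above, both processes would simply sit there forever.
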

Stack trace in git-remote-http:

(gdb) bt
#0 0x00007f04f1e1363d in read () from target:/lib64/libpthread.so.0
#1 0x0000562417472d73 in xread ()
#2 0x0000562417472f2b in read_in_full ()
#3 0x0000562417438a6e in get_packet_data ()
#4 0x0000562417439129 in packet_read ()
#5 0x00005624174245e0 in rpc_service ()
#6 0x0000562417424f10 in fetch_git ()
#7 0x00005624174233fd in main ()

Stack trace in git fetch-pack:

(gdb) bt
#0 0x00007fb3ab478620 in __read_nocancel () from target:/lib64/libpthread.so.0
#1 0x000055f688827283 in xread ()
#2 0x000055f68882743b in read_in_full ()
#3 0x000055f6887ce35e in get_packet_data ()
#4 0x000055f6887cea19 in packet_read ()
#5 0x000055f6887ceb90 in packet_read_line ()
#6 0x000055f68879dd05 in get_ack ()
#7 0x000055f68879f6b4 in fetch_pack ()
#8 0x000055f688710619 in cmd_fetch_pack ()
#9 0x000055f6886dff7b in handle_builtin ()
#10 0x000055f6886df026 in main ()

I looked at the diff between v2.9.3 and HEAD on fetch-pack.c and remote-curl.c and didn’t see anything noteworthy in that area of the code, so I presume the bug is still there in master.

-- 
Benoit "tsuna" Sigoure