On Tue, 12 Sep 2017, Ken Chang wrote:

> Everything seems to work for a few days, but then ssh starts to hang, and
> we start seeing several hundred ssh processes all trying to send their
> message but cannot. When i try to run ssh by hand, this is what i get:
>
> $ ssh -vvv boss@ui1
> OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013
> debug1: Reading configuration data /var/lib/worker/.ssh/config
> debug1: /var/lib/worker/.ssh/config line 1: Applying options for *
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: /etc/ssh/ssh_config line 56: Applying options for *
> debug1: auto-mux: Trying existing master
>
> And it hangs at that point indefinitely until Ctrl-C.
>
> At this point in time, we do see the ssh mux process still running:
>
> $ ps -eo pid,user,args | awk '$2=="worker" && $3=="ssh:" && $5=="[mux]" {print}'
> 29305 worker ssh: /var/lib/worker/.ssh/cm-boss@ui1:22 [mux]
>
> I tried to attach strace to the ssh mux process, and this is what i see
> when the problem is happening:

A debug log from the mux process from around this point would be much
more useful. Is there any chance you could catch one?

> accept(4, 0x7ffe26b34360, [128]) = -1 EMFILE (Too many open files)

This indicates that you're running out of file descriptors. Have you
increased MaxSessions on the server? Do you have a PAM module, login.conf
or similar reducing a ulimit?

Otherwise, there might be an fd leak in the mux code. You should try a
recent version; 6.6.1 is over three years old...

-d
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
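
One way to tell a reduced ulimit from an fd leak, as raised in the reply
above, is to compare the number of descriptors the mux process currently
holds with its soft limit. The following is only a rough sketch: it assumes
a Linux host with /proc mounted, the helper name fd_usage is made up for
illustration, and it has to run as the process owner or as root.

```shell
#!/bin/sh
# Sketch: print the open-descriptor count and the soft "Max open files"
# limit for one process. Assumes Linux /proc; fd_usage is a hypothetical
# helper name, not part of OpenSSH.
fd_usage() {
    pid=$1
    # Each entry under /proc/<pid>/fd corresponds to one open descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # The limits line reads: "Max open files  <soft>  <hard>  files",
    # so the soft limit is the fourth whitespace-separated field.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open_fds=$open soft_limit=$soft"
}

# Example (PID taken from the ps output quoted above):
# fd_usage 29305
```

If open_fds climbs steadily over a few days until it reaches soft_limit,
that would point towards a leak; if soft_limit is unexpectedly low from
the start, a PAM module or similar is the more likely cause.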