How does this behave when you enable SSH keepalive?

From Phone

________________________________
From: openssh-unix-dev <openssh-unix-dev-bounces+herbie.robinson=stratus.com@xxxxxxxxxxx> on behalf of Damien Miller <djm@xxxxxxxxxxx>
Sent: Monday, September 19, 2022 4:55:21 AM
To: Corey Hickey <bugfood-ml@xxxxxxxxxx>
Cc: openssh-unix-dev@xxxxxxxxxxx <openssh-unix-dev@xxxxxxxxxxx>
Subject: [EXTERNAL] Re: TCP Forwarding hangs when TCP service is unresponsive, even when TCP client exits

On Fri, 16 Sep 2022, Corey Hickey wrote:

First, thanks for the detailed investigation and for reproducing this with git HEAD.

> When a TCP client does not receive a response from a service, the client
> can opt to time out and exit. If the connection is passed through an SSH
> tunnel, however, certain circumstances can make the SSH tunnel hang
> indefinitely. This affects both remote port forwarding (-R) and local
> (-L).
>
> This report is for current openssh-portable git running on Linux. For
> released versions, at least OpenSSH 8.9p1 is affected as well, though I
> did not test other versions.
>
> --- Remote Forwarding ---
> To reproduce, first start up a TCP service which will accept connections
> but fail to respond thereafter. One way to do this is by stopping netcat
> shortly after startup (options are for the OpenBSD version of netcat).
>
> $ nc -k -l 127.0.0.1 9999 > /dev/null & sleep 1 ; kill -STOP %%
>
> Next use a different terminal to start an SSH service. Debug options are
> not required but are helpful for diagnosis. This example uses port 2222
> in order to avoid conflict with the system SSH service. The service can
> be run on the same local host or on a remote host.
>
> $ sudo /usr/sbin/sshd -d -d -f /dev/null -o Port=2222 -o \
>     HostKey=/etc/ssh/ssh_host_ed25519_key
>
> Next use a different terminal to start an SSH client which runs a TCP
> client over a forwarded connection. This example uses wget, but any TCP
> client that can be configured to time out should behave the same.
>
> $ ssh localhost -p 2222 -R 8888:127.0.0.1:9999 -v -v wget \
>     --timeout=1 --tries=1 http://127.0.0.1:8888
>
> The observed results are that the TCP client (wget) exits, but the SSH
> client hangs until either manually killed or the TCP service (netcat) is
> resumed.
>
> In detail, the sequence of events is as follows:
> 1. The SSH client connects to the server; the client and server set up
> channels as usual, including one for the port-forwarding. The SSH client
> starts a TCP client on the SSH server.
> 2. The TCP client connects to the SSH server's listening socket, and the
> SSH client connects to the TCP service's listening socket. The 3-way
> handshakes complete, but when the TCP client sends data to the service,
> the service never responds.
> 3. The TCP client times out, closes its socket for the connection to the
> SSH server, and exits. The SSH server sends the client an EOF on the
> forwarded channel, but does not close its own socket for the connection
> to the now-exited TCP client; this socket remains in CLOSE_WAIT.
> 4. The SSH client receives the EOF and drains the channel output, but
> continues to wait for data on the channel input. The SSH server won't
> close the channel until the client does, and the client won't close the
> channel until it receives data (or an error) from the channel.

This is kind of a tricky case, because in some cases it's AFAIK
impossible for the client to discern between a TCP server that a) will
never respond and b) hasn't responded *yet*.
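
The ambiguity can be seen from the client side with a small Python sketch. It is not OpenSSH code, and the function name `probe_silent_service` is hypothetical. It stands in for the stopped netcat with a listening socket that is never accepted, assuming a platform (such as Linux) where the kernel completes the handshake for connections queued in the listen backlog. All the client can observe is that its own timer expired.

```python
import socket


def probe_silent_service(timeout=1.0):
    """Connect to a service that accepts at the TCP level but never replies.

    Returns "replied" if any data (or EOF) arrives, "timed out" otherwise.
    """
    # A listening socket that is never accept()ed: the kernel completes the
    # 3-way handshake, but no application ever reads the request or answers.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)  # applies to connect, send and recv
    try:
        client.connect(("127.0.0.1", port))
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        try:
            client.recv(1024)
            return "replied"
        except socket.timeout:
            # Nothing here distinguishes "will never respond" from
            # "has not responded yet"; only the local timer fired.
            return "timed out"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(probe_silent_service(1.0))
```

The timeout value is a policy choice made by the client, not information learned from the peer, which is why a forwarder sitting in the middle cannot infer it.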

The solution that you proposed is unfortunately not without side
effects - I think it changes the behaviour of half-closed TCP
connections in a way that might lose data.

> [djm@lll ~]$ wget --timeout=1 --tries=1 http://127.0.0.1:8888
> --2022-09-19 18:16:17--  http://127.0.0.1:8888/
> Connecting to 127.0.0.1:8888... debug3: receive packet: type 90
> debug1: client_input_channel_open: ctype forwarded-tcpip rchan 3 win 2097152 max 32768
> debug1: client_request_forwarded_tcpip: listen localhost port 8888, originator 127.0.0.1 port 35148
> debug2: fd 7 setting O_NONBLOCK
> debug2: fd 7 setting TCP_NODELAY
> debug1: connect_next: host 127.0.0.1 ([127.0.0.1]:9999) in progress, fd=7
> debug3: fd 7 is O_NONBLOCK
> debug3: fd 7 is O_NONBLOCK
> debug1: channel 1: new [127.0.0.1]
> debug1: confirm forwarded-tcpip
> debug3: channel 1: waiting for connection
> debug3: channel 1: waiting for connection
> connected.
> HTTP request sent, awaiting response... debug3: channel 1: waiting for connection
> debug3: channel 1: waiting for connection
> Read error (Connection timed out) in headers.
> Giving up.
>
> debug3: channel 1: waiting for connection
> debug3: channel 1: waiting for connection

Is this what you see too?

IMO the root problem here is that channels in state
SSH_CHANNEL_CONNECTING have no timeout unless the system's TCP stack
implements one. Maybe OpenSSH should implement something conservative
here.

I do notice some different behaviour between Linux (above) and OpenBSD.
On OpenBSD the connection is accepted but obviously does not pass any
data (of course). This is harder to fix without the side effects I
mentioned above, e.g. consider a TCP client program that connects to a
forwarded socket, sends a message and exits without waiting for a reply.
I think setting c->force_drain in this case could cause the message to
be lost (though I'm not 100% sure).
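
As an illustration of a conservative timeout on a connection that is still "in progress", here is a minimal Python sketch of a non-blocking connect bounded by a deadline. This is a generic socket technique, not a patch and not how OpenSSH's channel code is structured. `connect_with_deadline` and its parameters are hypothetical names chosen for the example.

```python
import errno
import os
import select
import socket


def connect_with_deadline(host, port, deadline_s):
    """Start a non-blocking connect and give up after deadline_s seconds.

    Returns the connected socket, raises TimeoutError if the connect is
    still pending at the deadline, or OSError if the connect fails.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex returns an errno value instead of raising.
    err = sock.connect_ex((host, port))
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))

    # A pending connect is finished (successfully or not) once the socket
    # becomes writable; wait for that, but only up to the deadline.
    _, writable, _ = select.select([], [sock], [], deadline_s)
    if not writable:
        sock.close()
        raise TimeoutError("connect did not complete before the deadline")

    # SO_ERROR reports whether the finished connect succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    conn = connect_with_deadline("127.0.0.1", listener.getsockname()[1], 2.0)
    print("connected to", conn.getpeername())
    conn.close()
    listener.close()
```

Note that this only bounds the connect phase. It does not address the half-closed case described above, where the connection is established but one side never sends anything.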

-d
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev