On Wed, 2018-12-12 at 16:56 +0000, Trond Myklebust wrote: > On Wed, 2018-12-12 at 08:51 -0500, Dave Wysochanski wrote: > > Commit 9b30889c548a changed the handling of TCP_CLOSE inside > > xs_tcp_state_change. Prior to this change, the XPRT_CONNECTED bit > > was cleared unconditionally inside xprt_disconnect_done, similar > > to the handling of TCP_CLOSE_WAIT. After the change the clearing > > of XPRT_CONNECTED depends on successfully queueing a work based > > xprt_autoclose which depends on XPRT_LOCKED and may not happen. > > This is significant in the case of an unexpected RST from the > > server, as the client will only see xs_tcp_state_change called with > > sk_state == TCP_CLOSE. Restore the unconditional clear_bit on > > XPRT_CONNECTED while handling TCP_CLOSE and make it consistent > > with handling TCP_CLOSE_WAIT. > > > > Signed-off-by: Dave Wysochanski <dwysocha@xxxxxxxxxx> > > --- > > net/sunrpc/xprtsock.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > > index 8a5e823e0b33..b9789036051d 100644 > > --- a/net/sunrpc/xprtsock.c > > +++ b/net/sunrpc/xprtsock.c > > @@ -1492,6 +1492,7 @@ static void xs_tcp_state_change(struct sock > > *sk) > > if (sk->sk_err) > > xprt_wake_pending_tasks(xprt, -sk- > > >sk_err); > > /* Trigger the socket release */ > > + clear_bit(XPRT_CONNECTED, &xprt->state); > > xs_tcp_force_close(xprt); > > } > > out: > > Hi Dave, > > This isn't needed for 4.20 or newer because call_transmit() will now > always call xprt_end_transmit(). I suggest that a stable fix do > something similar (perhaps conditional on the error returned by > xprt_transmit()?). > Can you explain the handling in xs_tcp_state_change - why XPRT_CONNECTED would need to remain set longer in the case of handling TCP_CLOSE case rather than TCP_CLOSE_WAIT? It seems like if we get an RST we'll go directly to TCP_CLOSE and why would the XPRT_CONNECTED bit getting cleared need to be delayed in that case? I will look into the approach you mention though at the moment I do not see how it helps because the underlying issue seems to be clearing of the XPRT_CONNECTED bit. Thanks.