On Fri, 2017-10-06 at 15:12 -0700, Manjunath Patil wrote:
> On Fri, Oct 6, 2017 at 1:51 PM, Trond Myklebust
> <trondmy@primarydata.com> wrote:
> > On Fri, 2017-10-06 at 12:13 -0700, Manjunath Patil wrote:
> > > Hi David,
> > >
> > > On Fri, Sep 22, 2017 at 12:21 PM, Manjunath Patil
> > > <mbpatil.linux@xxxxxxxxx> wrote:
> > > > Hi David,
> > > >
> > > > On Thu, Sep 21, 2017 at 10:05 AM, David Wysochanski
> > > > <dwysocha@redhat.com> wrote:
> > > > > On Wed, 2017-09-20 at 15:17 -0700, Manjunath Patil wrote:
> > > > > > Hi,
> > > > > >
> > > > > > With autoclose trying to close the connection after the idle
> > > > > > timeout in NFSv3 mounts, a bad NFS server may not send the
> > > > > > final FIN, leaving the client in FIN_WAIT_2 state forever.
> > > > > > This is easily reproducible by simulating the bad server
> > > > > > behavior. I used 'netstat -an | grep 2049' to observe the
> > > > > > socket state.
> > > > >
> > > > > How long did you wait and how did you simulate the failure?
> > > > > I am very interested in your test case.
> > > >
> > > > I observed this in ct environment. In this case the FIN_WAIT_2
> > > > state stayed forever. ct had to restart the node to get out.
> > > >
> > > > We tried to simulate this behavior in the Linux NFS server by
> > > > stopping the incoming FIN for port 2049 inside the kernel. This
> > > > prevented the server from sending the final FIN for some time.
> > > >
> > > > The Linux server eventually sent a FIN after some delay. Though
> > > > I am not sure, I think this is due to
> > > >
> > > > /* apparently the "standard" is that clients close
> > > >  * idle connections after 5 minutes, servers after
> > > >  * 6 minutes
> > > >  *   http://www.connectathon.org/talks96/nfstcp.pdf
> > > >  */
> > > > static int svc_conn_age_period = 6*60;
> > >
> > > I tried to increase this value.
> > > After setting this value to a high value [60*60], I could see the
> > > client staying in FIN_WAIT_2 state forever.
> > >
> > > To repeat, my test case is:
> > > 1. Take an NFS server and make it not send the FIN on port 2049.
> > > 2. Use any upstream kernel [I used 4.14-rc1] as the NFS client.
> > > 3. Let the mount be idle for 5 minutes so that autoclose gets
> > >    triggered.
> > > 4. After this, the client stays in FIN_WAIT_2 state [we can
> > >    observe it with netstat -an | grep 2049].
> > > 5. At this point no new NFS connection is allowed on this port,
> > >    so the mount is hung for the application.
> >
> > What do you mean when you say "make it not send FIN"? Are you just
> > filtering all packets with a FIN flag set? Normally, a FIN is
> > expected to be ACKed by the recipient so that it can be
> > retransmitted if lost.
>
> In my test case I prevented the TCP layer itself [by code change] from
> sending the FIN packet on port 2049.
>
> The client sends a FIN, gets an ACK,
> then
> the client expects the final FIN, and the server never sends it.
>
> > However, even if it does not receive the FIN from the server, the
> > FIN_WAIT2 state should automatically time out after
> > /proc/sys/net/ipv4/tcp_fin_timeout seconds (see the description of
> > the TCP_LINGER2 socket option). Isn't this working?
>
> I think this behavior is true only for a full close of the socket. The
> present issue happens only with autoclose().
> The autoclose behavior was changed from full close to half close by
> the following commit:
> caf4ccd SUNRPC: Make xs_tcp_close() do a socket shutdown rather than
> a sock_release
>
> The following commit might be related too:
> 9cbc94f SUNRPC: Remove TCP socket linger code

[trondmy@leira linux]$ grep sock_shutdown net/sunrpc/xprtsock.c
        kernel_sock_shutdown(sock, SHUT_RDWR);
        kernel_sock_shutdown(sock, SHUT_RDWR);

SHUT_RDWR is a full close AFAIK...
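
For anyone repeating the test case above, here is a minimal sketch of the socket-state check, roughly what 'netstat -an | grep 2049' shows. It assumes the Linux /proc/net/tcp text layout (hex ports, two-digit hex state code in the "st" column). The function names and the SAMPLE text are made up for illustration; they are not taken from the kernel or from the reporter's setup.

```python
# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

NFS_PORT = 2049  # port used throughout this thread


def sockets_by_remote_port(proc_net_tcp_text, port=NFS_PORT):
    """Return (local_port, state_name) for each socket whose peer port is `port`."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        # Addresses look like HEXIP:HEXPORT; only the port is decoded here.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if remote_port == port:
            state = TCP_STATES.get(fields[3].upper(), fields[3])
            found.append((local_port, state))
    return found


def stuck_in_fin_wait2(proc_net_tcp_text, port=NFS_PORT):
    """Return local ports of sockets to `port` that sit in FIN_WAIT2."""
    return [local_port
            for local_port, state in sockets_by_remote_port(proc_net_tcp_text, port)
            if state == "FIN_WAIT2"]


# Hypothetical sample in the /proc/net/tcp layout; addresses, ports and
# inode numbers are invented for illustration only.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1111
   1: 0A00000A:03D5 1400000A:0801 05 00000000:00000000 00:00000000 00000000     0        0 2222
   2: 0A00000A:9C40 1400000A:01BB 01 00000000:00000000 00:00000000 00000000     0        0 3333
"""
```

On a live client the text would come from open("/proc/net/tcp").read() instead of SAMPLE. Polling stuck_in_fin_wait2() for longer than tcp_fin_timeout seconds after autoclose fires should show whether the FIN_WAIT2 entry ever goes away.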
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx