Re: [Bug ?] Permanent FIN_WAIT_2 state on NFS client with bad NFS server

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Fri, 6 Oct 2017 22:38:18 +0000

On Fri, 2017-10-06 at 15:12 -0700, Manjunath Patil wrote:
> On Fri, Oct 6, 2017 at 1:51 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> > On Fri, 2017-10-06 at 12:13 -0700, Manjunath Patil wrote:
> > > Hi David,
> > > 
> > > On Fri, Sep 22, 2017 at 12:21 PM, Manjunath Patil
> > > <mbpatil.linux@xxxxxxxxx> wrote:
> > > > Hi David,
> > > > 
> > > > On Thu, Sep 21, 2017 at 10:05 AM, David Wysochanski <dwysocha@redhat.com> wrote:
> > > > > On Wed, 2017-09-20 at 15:17 -0700, Manjunath Patil wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > With autoclose trying to close the connection after the idle
> > > > > > timeout in NFSv3 mounts, a bad NFS server may not send the final
> > > > > > FIN, leaving the client in the FIN_WAIT_2 state forever.
> > > > > > This is easily reproducible by simulating the bad server behavior.
> > > > > > I used 'netstat -an | grep 2049' to observe the socket state.
> > > > > > 
> > > > > 
> > > > > How long did you wait and how did you simulate the failure?
> > > > > I am very interested in your test case.
> > > > 
> > > > I observed this in ct environment. In this case the FIN_WAIT_2 state
> > > > stayed forever.
> > > > ct had to restart the node to get out.
> > > > 
> > > > We tried to simulate this behavior in a Linux NFS server by stopping
> > > > the incoming FIN for port 2049 inside the kernel. This prevented the
> > > > server from sending the final FIN for some time.
> > > > 
> > > > The Linux server eventually sent a FIN after some delay. Though I am
> > > > not sure, I think this is due to
> > > > 
> > > > /* apparently the "standard" is that clients close
> > > >  * idle connections after 5 minutes, servers after
> > > >  * 6 minutes
> > > >  *   http://www.connectathon.org/talks96/nfstcp.pdf
> > > >  */
> > > > static int svc_conn_age_period = 6*60;
> > > 
> > > I tried to increase this value.
> > > After setting this value to a high value [60*60], I could see the
> > > client staying in FIN_WAIT_2 state forever.
> > > 
> > > To repeat, my test case is:
> > > 1. Take an NFS server and make it not send the FIN on port 2049.
> > > 2. Use any upstream kernel [I used 4.14-rc1] as the NFS client.
> > > 3. Let the mount be idle for 5 minutes so that autoclose gets triggered.
> > > 4. After this, the client stays in the FIN_WAIT_2 state [we can observe
> > >    it with netstat -an | grep 2049].
> > > 5. At this point no new NFS connection is allowed on this port, so the
> > >    mount is hung for the application.
> > 
> > What do you mean when you say "make it not send FIN"? Are you just
> > filtering all packets with a FIN flag set? Normally, a FIN is expected
> > to be ACKed by the recipient so that it can be retransmitted if lost.
> 
> In my test case I prevented the TCP layer itself [by a code change] from
> sending the FIN packet on port 2049.
> 
> The client sends a FIN and gets an ACK; then the client expects the final
> FIN, but the server never sends it.
> > 
> > 
> > However, even if it does not receive the FIN from the server, the
> > FIN_WAIT2 state should automatically time out after
> > /proc/sys/net/ipv4/tcp_fin_timeout seconds (see the description of the
> > TCP_LINGER2 socket option). Isn't this working?
> > 
> 
> I think this behavior is true only for a full close of the socket. The
> present issue happens only with autoclose. The autoclose behavior was
> changed from a full close to a half close by the following commit:
> caf4ccd SUNRPC: Make xs_tcp_close() do a socket shutdown rather than a sock_release
> 
> The following commit might be related too -
> 9cbc94f SUNRPC: Remove TCP socket linger code
> 

[trondmy@leira linux]$ grep sock_shutdown net/sunrpc/xprtsock.c 
	kernel_sock_shutdown(sock, SHUT_RDWR);
		kernel_sock_shutdown(sock, SHUT_RDWR);

SHUT_RDWR is a full close AFAIK...

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx