Re: [Bug ?] Permanent FIN_WAIT_2 state on NFS client with bad NFS server

On Fri, 2017-10-06 at 12:13 -0700, Manjunath Patil wrote:
> Hi David,
> 
> On Fri, Sep 22, 2017 at 12:21 PM, Manjunath Patil
> <mbpatil.linux@xxxxxxxxx> wrote:
> > Hi David,
> > 
> > > On Thu, Sep 21, 2017 at 10:05 AM, David Wysochanski
> > > <dwysocha@redhat.com> wrote:
> > > On Wed, 2017-09-20 at 15:17 -0700, Manjunath Patil wrote:
> > > > Hi,
> > > > 
> > > > > When autoclose tries to close the connection after the idle
> > > > > timeout on NFSv3 mounts, a bad NFS server may never send the
> > > > > final FIN, leaving the client in FIN_WAIT_2 state forever.
> > > > > This is easily reproducible by simulating the bad server
> > > > > behavior. I used 'netstat -an | grep 2049' to observe the
> > > > > socket state.
> > > > 
> > > 
> > > How long did you wait and how did you simulate the failure?  I am
> > > very
> > > interested in your test case.
> > 
> > > I observed this in a ct environment. In this case the FIN_WAIT_2
> > > state persisted forever.
> > > ct had to restart the node to recover.
> > 
> > > We tried to simulate this behavior on a Linux NFS server by
> > > stopping the incoming FIN for port 2049 inside the kernel. This
> > > prevented the server from sending the final FIN for some time.
> > > 
> > > The Linux server eventually sent a FIN after some delay. Though I
> > > am not sure, I think this is due to
> > 
> > /* apparently the "standard" is that clients close
> >  * idle connections after 5 minutes, servers after
> >  * 6 minutes
> >  *   http://www.connectathon.org/talks96/nfstcp.pdf
> >  */
> > static int svc_conn_age_period = 6*60;
> 
> > I tried increasing this value.
> > After setting it to a high value [60*60], I could see the
> > client staying in FIN_WAIT_2 state forever.
> 
> > To repeat, my test case is:
> > 1. Take an NFS server and make it not send the FIN on port 2049.
> > 2. Use any upstream kernel [I used 4.14-rc1] as the NFS client.
> > 3. Let the mount be idle for 5 mins so that autoclose gets triggered.
> > 4. After this, the client stays in FIN_WAIT_2 state [we can observe it
> > with netstat -an | grep 2049].
> > 5. At this point no new NFS connection is allowed on this port, so
> > the mount is hung for the application.

What do you mean when you say "make it not send FIN"? Are you just
filtering all packets with a FIN flag set? Normally, a FIN is expected
to be ACKed by the recipient so that it can be retransmitted if lost.


However, even if the client does not receive the FIN from the server, the
FIN_WAIT_2 state should automatically time out after
/proc/sys/net/ipv4/tcp_fin_timeout seconds (see the description of the
TCP_LINGER2 socket option). Isn't this working?
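
For anyone wanting to check this without netstat, here is a minimal sketch in
C that counts IPv4 sockets sitting in FIN_WAIT_2 towards a given remote port
by scanning /proc/net/tcp. The function names are hypothetical, and it assumes
the usual /proc/net/tcp layout in which the fourth field is the hex state
code and 05 means FIN_WAIT_2. Polling it and comparing how long the entry
persists against the value in /proc/sys/net/ipv4/tcp_fin_timeout should show
whether the timeout fires.

```c
#include <assert.h>
#include <stdio.h>

/* State code the kernel prints in /proc/net/tcp for FIN_WAIT_2. */
#define PROC_TCP_FIN_WAIT2 0x05

/*
 * Hypothetical helper: returns 1 if one line of /proc/net/tcp describes a
 * socket in FIN_WAIT_2 whose remote port equals `port`, otherwise 0.
 * Header lines and anything that does not parse yield 0.
 */
int is_fin_wait2_to_port(const char *line, unsigned int port)
{
    unsigned int lport, rport, state;

    assert(line != NULL);
    /* Fields: "sl: local_addr:port rem_addr:port st ...", all hex but sl. */
    if (sscanf(line, " %*d: %*x:%x %*x:%x %x", &lport, &rport, &state) != 3)
        return 0;
    return state == PROC_TCP_FIN_WAIT2 && rport == port;
}

/*
 * Hypothetical helper: counts matching sockets in the file at `path`
 * (normally "/proc/net/tcp"). Returns -1 if the file cannot be opened.
 */
int count_fin_wait2(const char *path, unsigned int port)
{
    char line[512];
    int count = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL)
        count += is_fin_wait2_to_port(line, port);
    fclose(f);
    return count;
}
```

Calling count_fin_wait2("/proc/net/tcp", 2049) on the client once a second
after autoclose triggers would be one way to use it. IPv6 mounts would need
/proc/net/tcp6 instead, which this sketch does not handle.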

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx