Re: NFS server connection hang in CLOSE_WAIT state

On Fri, Dec 21, 2012 at 01:37:20PM +0530, Rajesh Ghanekar wrote:
> Hi,
>     I have an NFS server connection stuck in CLOSE_WAIT state and the
> corresponding NFS client connection stuck in FIN_WAIT2 state. The NFS
> client is not retrying because the NFS server isn't calling close() on
> the socket. The shutdown was initiated by the NFS client, and it's now
> waiting for the NFS server to send its FIN.
> 
>    I am using old kernel versions here:
> NFS server: 2.6.16 (sles10sp3)
> NFS client: 2.6.27 (sles11)
> 
>    I see that the sunrpc client code in 2.6.27 is missing the call to
> xs_tcp_schedule_linger_timeout():
> 
>         case TCP_FIN_WAIT1:
>                 /* The client initiated a shutdown of the socket */
>                 xprt->connect_cookie++;
>                 xprt->reestablish_timeout = 0;
>                 set_bit(XPRT_CLOSING, &xprt->state);
>                 smp_mb__before_clear_bit();
>                 clear_bit(XPRT_CONNECTED, &xprt->state);
>                 clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>                 smp_mb__after_clear_bit();
>                 /* The call below is missing in 2.6.27 */
>                 xs_tcp_schedule_linger_timeout(xprt, xs_tcp_fin_timeout);
>                 break;
> 
> 
>    So it's clear that if the above linger timeout had been there
> in 2.6.27, the client would have reconnected. But why the NFS
> server's sunrpc logic (svc_recv) hasn't called close() on its
> end of the socket is not yet clear to me. I would like to know if
> anyone has faced this issue and if there are any fixes which I can
> pick individually.
> 
>    If I do another mount from the same NFS client to another
> IP of the same NFS server, things work fine. Restarting the
> NFS server makes the original IP work again. But I would
> like to know why the transport hangs on the NFS server side.
> 
>   Sorry in advance for the noise if the expectation is to test with
> recent kernels and report accordingly. I can't test with new
> kernels.

Yeah, 2.6.16 is pretty old from the upstream point of view.  If you
can't retry with an upstream kernel then it should be reported to the
SLES folks.
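
When reporting it, it may help to attach the socket states from both
machines. What follows is only a rough sketch, not anything from the
kernel tree: it parses text in the /proc/net/tcp format and lists
sockets on the NFS port (2049 assumed) that are in CLOSE_WAIT or
FIN_WAIT2. The function names and the sample text are made up for
illustration, and the address decoding assumes a little-endian host and
IPv4 only.

```python
import socket

# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_ip, port).

    Assumes a little-endian host, where the four address bytes
    are printed in reverse order.
    """
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(bytes.fromhex(ip_hex)[::-1])
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Return a list of (local, remote, state) tuples, skipping the header."""
    rows = []
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) < 4:
            continue  # blank or truncated line
        state = TCP_STATES.get(parts[3].upper(), parts[3])
        rows.append((decode_addr(parts[1]), decode_addr(parts[2]), state))
    return rows


def stuck_nfs_sockets(text, port=2049):
    """Sockets touching `port` that are in CLOSE_WAIT or FIN_WAIT2."""
    return [
        row for row in parse_proc_net_tcp(text)
        if row[2] in ("CLOSE_WAIT", "FIN_WAIT2")
        and port in (row[0][1], row[1][1])
    ]


# Hypothetical sample in /proc/net/tcp layout (trailing columns omitted).
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:0801 0200007F:03FF 08 00000000:00000000
   1: 0100007F:0016 0200007F:C000 01 00000000:00000000
"""
```

On a live machine the input would be the contents of /proc/net/tcp, e.g.
stuck_nfs_sockets(open("/proc/net/tcp").read()). Running it on the server
and on the client should show the CLOSE_WAIT and FIN_WAIT2 ends of the
same connection.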

--b.