Re: Random NFS client lockups

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> writes:

>> Mar 16 05:02:40.051969: RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
>> Mar 16 05:02:40.052067: RPC:       xs_close xprt 0000000022aecad1
>> Mar 16 05:02:40.052189: RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
>> Mar 16 05:02:40.052243: RPC:       xs_error_report client 0000000022aecad1, error=32...
>> Mar 16 05:02:40.052367: RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
>> Mar 16 05:02:40.052503: RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
>> Mar 16 05:02:40.053201: RPC:       xs_connect scheduled xprt 0000000022aecad1
>> Mar 16 05:02:40.055886: RPC:       xs_bind 0000:0000:0000:0000:0000:0000:0000:0000:875: ok (0)
>> __A__  05:02:40.055947: RPC:       worker connecting xprt 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
>> Mar 16 05:02:40.055995: RPC:       0000000022aecad1 connect status 115 connected 0 sock state 2
>
> Socket is in TCP state 2 == SYN_SENT
> So the client requested to connect to the server

The server closed the connection (state 8, CLOSE_WAIT); the client
cleaned up correctly and reconnected.
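
For readers following the numeric fields in these log lines, here is a
minimal Python sketch that decodes them. The TCP state numbers and the
sk_shutdown bits are the Linux kernel values (include/net/tcp_states.h
and include/net/sock.h). Treating "connect status 115" and "error=32"
as Linux errno values is an assumption on my part, and the helper names
(decode_state_line, shutdown_flags, LINUX_ERRNO) are made up for this
illustration.

```python
import re

# TCP state numbers as used by the Linux kernel (include/net/tcp_states.h).
TCP_STATES = {
    1: "ESTABLISHED", 2: "SYN_SENT", 3: "SYN_RECV", 4: "FIN_WAIT1",
    5: "FIN_WAIT2", 6: "TIME_WAIT", 7: "CLOSE", 8: "CLOSE_WAIT",
    9: "LAST_ACK", 10: "LISTEN", 11: "CLOSING", 12: "NEW_SYN_RECV",
}

# sk_shutdown is a bit mask (include/net/sock.h).
RCV_SHUTDOWN = 1
SEND_SHUTDOWN = 2

# Assumption: the status/error numbers in the log are Linux errno values.
# Hard-coded because errno numbering differs between operating systems.
LINUX_ERRNO = {32: "EPIPE", 115: "EINPROGRESS"}

# Matches the "state N conn N dead N zapped N sk_shutdown N" log lines.
STATE_RE = re.compile(
    r"state (\d+) conn (\d+) dead (\d+) zapped (\d+) sk_shutdown (\d+)"
)


def shutdown_flags(mask):
    """Translate an sk_shutdown bit mask into a list of flag names."""
    flags = []
    if mask & RCV_SHUTDOWN:
        flags.append("RCV_SHUTDOWN")
    if mask & SEND_SHUTDOWN:
        flags.append("SEND_SHUTDOWN")
    return flags


def decode_state_line(line):
    """Decode one state-change log line; return None if it does not match."""
    m = STATE_RE.search(line)
    if m is None:
        return None
    state, conn, _dead, _zapped, shutdown = map(int, m.groups())
    return {
        "tcp_state": TCP_STATES.get(state, "UNKNOWN"),
        "connected": bool(conn),
        "shutdown": shutdown_flags(shutdown),
    }


if __name__ == "__main__":
    print(decode_state_line("state 8 conn 1 dead 0 zapped 1 sk_shutdown 1"))
    print(decode_state_line("state 7 conn 0 dead 0 zapped 1 sk_shutdown 3"))
```

Read this way, "state 8 ... sk_shutdown 1" is CLOSE_WAIT with only the
receive side shut down (the server's FIN has arrived), and "state 7 ...
sk_shutdown 3" is a fully closed socket.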


>> Mar 16 05:07:28.326605: RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
>
> Socket is now in TCP state 8 == TCP_CLOSE_WAIT...
>
> That means the server sent a FIN to the client to request that the
> connection be closed.

Yes; the same situation as above.


>> Mar 16 05:07:28.326679: RPC:       xs_connect scheduled xprt 0000000022aecad1
>> __B__  05:07:28.326978: RPC:       worker connecting xprt 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
>> Mar 16 05:07:28.327050: RPC:       0000000022aecad1 connect status 0 connected 0 sock state 8
>> __C__  05:07:28.327113: RPC:       xs_close xprt 0000000022aecad1
>
> Client closes the socket, which is still in TCP_CLOSE_WAIT

The 'xs_close' is very likely a reaction to the state change reported
above and should have happened before 'xs_connect'.


> Basically, what the above means is that your server is initiating the
> close of the connection, not the client.

Yes, but the client should reconnect (and it does in most cases).
Sometimes there seems to be a race that prevents the reconnect and
leaves the client in a broken state.
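
To sift a long debug log for this ordering, something like the
following Python sketch might help. It is only a heuristic based on the
three message formats quoted in this thread ('xs_close', 'xs_connect
scheduled', 'worker connecting'), not a diagnosis of the suspected race.
The function names are hypothetical. It flags a transport whose most
recent event is an 'xs_close' that came directly after a 'worker
connecting' line, with no later reconnect scheduled.

```python
import re

# Message formats taken from the log lines quoted in this thread.
PATTERNS = [
    ("close", re.compile(r"RPC:\s+xs_close xprt (\w+)")),
    ("scheduled", re.compile(r"RPC:\s+xs_connect scheduled xprt (\w+)")),
    ("connecting", re.compile(r"RPC:\s+worker connecting xprt (\w+)")),
]


def parse_events(lines):
    """Return (event_name, xprt_id) pairs in log order."""
    events = []
    for line in lines:
        for name, pattern in PATTERNS:
            m = pattern.search(line)
            if m:
                events.append((name, m.group(1)))
                break
    return events


def suspicious_transports(lines):
    """Return xprt ids left in 'closed right after connecting' at end of log."""
    last = {}  # xprt id -> most recent event, with the suspicious case marked
    for name, xprt in parse_events(lines):
        if name == "close" and last.get(xprt) == "connecting":
            last[xprt] = "closed_after_connecting"
        else:
            # Any other event (including a new 'scheduled') clears the mark.
            last[xprt] = name
    return sorted(x for x, ev in last.items() if ev == "closed_after_connecting")


if __name__ == "__main__":
    import sys
    for xprt in suspicious_transports(sys.stdin):
        print("possible missed reconnect on xprt", xprt)
```

Fed the 05:02:40 sequence it reports nothing (close, then schedule, then
connect); fed the 05:07:28 sequence it reports the xprt, because the
close follows the connect attempt.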



Enrico



