Re: Random NFS client lockups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2022-03-16 at 21:55 +0100, Enrico Scholz wrote:
> [You don't often get email from enrico.scholz@xxxxxxxxxxxxxxxxx.
> Learn why this is important at
> http://aka.ms/LearnAboutSenderIdentification.]
> 
> Enrico Scholz <enrico.scholz@xxxxxxxxxxxxxxxxx> writes:
> 
> > I am experiencing random NFS client lockups after 1-2 days.
> > ...
> > The full log is available at https://pastebin.pl/view/7d0b345b
> 
> Filtering only RPC messages[1] shows
> 
> -----
> Mar 16 04:48:58.345357: RPC:       worker connecting xprt
> 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
> Mar 16 04:48:58.345487: RPC:       0000000022aecad1 connect status
> 115 connected 0 sock state 2
> Mar 16 04:48:58.345705: RPC:       state 1 conn 0 dead 0 zapped 1
> sk_shutdown 0
> Mar 16 05:02:40.037783: RPC:       state 8 conn 1 dead 0 zapped 1
> sk_shutdown 1
> Mar 16 05:02:40.037921: RPC:       xs_close xprt 0000000022aecad1
> Mar 16 05:02:40.038076: RPC:       state 9 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.038211: RPC:       state 7 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.038293: RPC:       xs_error_report client
> 0000000022aecad1, error=104...
> Mar 16 05:02:40.038449: RPC:       state 7 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.039134: RPC:       xs_connect scheduled xprt
> 0000000022aecad1
> Mar 16 05:02:40.042991: RPC:       xs_bind
> 0000:0000:0000:0000:0000:0000:0000:0000:883: ok (0)
> Mar 16 05:02:40.043072: RPC:       worker connecting xprt
> 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
> Mar 16 05:02:40.043159: RPC:       0000000022aecad1 connect status
> 115 connected 0 sock state 2
> Mar 16 05:02:40.043308: RPC:       state 1 conn 0 dead 0 zapped 1
> sk_shutdown 0
> Mar 16 05:02:40.045883: RPC:       state 8 conn 1 dead 0 zapped 1
> sk_shutdown 1
> Mar 16 05:02:40.046163: RPC:       xs_close xprt 0000000022aecad1
> Mar 16 05:02:40.046410: RPC:       state 9 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.046625: RPC:       state 9 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.047428: RPC:       xs_connect scheduled xprt
> 0000000022aecad1
> Mar 16 05:02:40.050387: RPC:       xs_bind
> 0000:0000:0000:0000:0000:0000:0000:0000:822: ok (0)
> Mar 16 05:02:40.050614: RPC:       worker connecting xprt
> 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
> Mar 16 05:02:40.050662: RPC:       0000000022aecad1 connect status
> 115 connected 0 sock state 2
> Mar 16 05:02:40.050788: RPC:       state 1 conn 0 dead 0 zapped 1
> sk_shutdown 0
> Mar 16 05:02:40.051969: RPC:       state 8 conn 1 dead 0 zapped 1
> sk_shutdown 1
> Mar 16 05:02:40.052067: RPC:       xs_close xprt 0000000022aecad1
> Mar 16 05:02:40.052189: RPC:       state 7 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.052243: RPC:       xs_error_report client
> 0000000022aecad1, error=32...
> Mar 16 05:02:40.052367: RPC:       state 7 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.052503: RPC:       state 7 conn 0 dead 0 zapped 1
> sk_shutdown 3
> Mar 16 05:02:40.053201: RPC:       xs_connect scheduled xprt
> 0000000022aecad1
> Mar 16 05:02:40.055886: RPC:       xs_bind
> 0000:0000:0000:0000:0000:0000:0000:0000:875: ok (0)
> __A__  05:02:40.055947: RPC:       worker connecting xprt
> 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
> Mar 16 05:02:40.055995: RPC:       0000000022aecad1 connect status
> 115 connected 0 sock state 2

Socket is in TCP state 2 == SYN_SENT
So the client requested to connect to the server

> Mar 16 05:02:40.056125: RPC:       state 1 conn 0 dead 0 zapped 1
> sk_shutdown 0

Socket is in TCP state 1 == TCP_ESTABLISHED
The connection to the server was accepted, and the connection is
established.

> Mar 16 05:07:28.326605: RPC:       state 8 conn 1 dead 0 zapped 1
> sk_shutdown 1

Socket is now in TCP state 8 == TCP_CLOSE_WAIT...

That means the server sent a FIN to the client to request that the
connection be closed.

(see
https://users.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf
for more information on the TCP state diagram)

> Mar 16 05:07:28.326679: RPC:       xs_connect scheduled xprt
> 0000000022aecad1
> __B__  05:07:28.326978: RPC:       worker connecting xprt
> 0000000022aecad1 via tcp to XXXXX:2001:1022:: (port 2049)
> Mar 16 05:07:28.327050: RPC:       0000000022aecad1 connect status 0
> connected 0 sock state 8
> __C__  05:07:28.327113: RPC:       xs_close xprt 0000000022aecad1

Client closes the socket, which is still in TCP_CLOSE_WAIT

> Mar 16 05:07:28.327229: RPC:       state 9 conn 0 dead 0 zapped 1
> sk_shutdown 3

...and then transitions to TCP state 9 == TCP_LAST_ACK

> Mar 16 05:07:28.327446: RPC:       state 9 conn 0 dead 0 zapped 1
> sk_shutdown 3
> -----
> 

For more information on TCP states, see
https://users.cs.northwestern.edu/~agupta/cs340/project2/TCPIP_State_Transition_Diagram.pdf

Basically, what the above means is that your server is initiating the
close of the connection, not the client.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux