> On Mar 8, 2023, at 14:01, Siddharth Kawar <Siddharth.Kawar@xxxxxxxxxxxxx> wrote:
> >
> > Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
> >
> > NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
> > reconnecting using the same local TCP port. Unique NFS operations are
> > identified by the per-TCP-connection set of XIDs. This prevents file
> > corruption when non-idempotent NFS operations are retried.
> >
> > Currently, NFS client TCP connections use different local TCP ports
> > when reconnecting to NFS servers.
> >
> > After an NFS server initiates shutdown of the TCP connection, the NFS
> > client's TCP socket is set to NULL after the socket state has reached
> > TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
> > the same local port but fails with EADDRNOTAVAIL (99). This forces the
> > socket to use a different local TCP port to reconnect to the remote NFS
> > server.
> >
> > State Transition and Events:
> > TCP_CLOSING(8)
>
> No: TCP state == 8 is TCP_CLOSE_WAIT.

You are right. Will fix in v2.

> > TCP_LAST_ACK(9)
> > connect(fail EADDRNOTAVAIL(99))
> > TCP_CLOSE(7)
> > bind on new port
> > connect success
> >
> > dmesg excerpts showing the reconnect switching from local port 688 to 1014:
> > [590701.200229] NFS: mkdir(0:61/560857152), testQ
> > [590701.200231] NFS call mkdir testQ
> > [590701.200259] RPC: xs_tcp_send_request(224) = 0
> > [590751.883111] RPC: xs_tcp_state_change client 0000000051f4e000...
> > [590751.883146] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> > [590751.883160] RPC: xs_data_ready...
> > [590751.883232] RPC: xs_tcp_state_change client 0000000051f4e000...
> > [590751.883235] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883238] RPC: xs_tcp_state_change client 0000000051f4e000...
> > [590751.883239] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883283] RPC: xs_connect scheduled xprt 0000000051f4e000
> > [590751.883314] RPC: xs_bind 0.0.0.0:688: ok (0)
> > [590751.883321] RPC: worker connecting xprt 0000000051f4e000 via tcp
> >                 to 10.101.1.30 (port 2049)
> > [590751.883330] RPC: 0000000051f4e000 connect status 99 connected 0
> >                 sock state 7
> > [590751.883342] RPC: xs_tcp_state_change client 0000000051f4e000...
> > [590751.883343] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [590751.883356] RPC: xs_connect scheduled xprt 0000000051f4e000
> > [590751.883383] RPC: xs_bind 0.0.0.0:1014: ok (0)
> > [590751.883388] RPC: worker connecting xprt 0000000051f4e000 via tcp
> >                 to 10.101.1.30 (port 2049)
> > [590751.883420] RPC: 0000000051f4e000 connect status 115 connected 0
> >                 sock state 2
> > ...
> >
> > State Transition and Events with patch applied:
> > TCP_CLOSING(8)
> > TCP_LAST_ACK(9)
> > TCP_CLOSE(7)
> > connect(reuse of port succeeds)
> >
> > dmesg excerpts showing reconnect on the same port (936):
> > [ 257.139935] NFS: mkdir(0:59/560857152), testQ
> > [ 257.139937] NFS call mkdir testQ
> > ...
> > [ 307.822702] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> > [ 307.822714] RPC: xs_data_ready...
> > [ 307.822817] RPC: xs_tcp_state_change client 00000000ce702f14...
> > [ 307.822821] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [ 307.822825] RPC: xs_tcp_state_change client 00000000ce702f14...
> > [ 307.822826] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [ 307.823606] RPC: xs_tcp_state_change client 00000000ce702f14...
> > [ 307.823609] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [ 307.823629] RPC: xs_tcp_state_change client 00000000ce702f14...
> > [ 307.823632] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [ 307.823676] RPC: xs_connect scheduled xprt 00000000ce702f14
> > [ 307.823704] RPC: xs_bind 0.0.0.0:936: ok (0)
> > [ 307.823709] RPC: worker connecting xprt 00000000ce702f14 via tcp
> >               to 10.101.1.30 (port 2049)
> > [ 307.823748] RPC: 00000000ce702f14 connect status 115 connected 0
> >               sock state 2
> > ...
> > [ 314.916193] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> > [ 314.916251] RPC: xs_connect scheduled xprt 00000000ce702f14
> > [ 314.916282] RPC: xs_bind 0.0.0.0:936: ok (0)
> > [ 314.916292] RPC: worker connecting xprt 00000000ce702f14 via tcp
> >               to 10.101.1.30 (port 2049)
> > [ 314.916342] RPC: 00000000ce702f14 connect status 115 connected 0
> >               sock state 2
> >
> > Fixes: 7c81e6a9d75b ("SUNRPC: Tweak TCP socket shutdown in the RPC client")
> > Signed-off-by: Siddharth Rajendra Kawar <sikawar@xxxxxxxxxxxxx>
> > ---
> > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > index d8ee06a9650a..e55e380bc10c 100644
> > --- a/net/sunrpc/xprtsock.c
> > +++ b/net/sunrpc/xprtsock.c
> > @@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
> >  		break;
> >  	case TCP_ESTABLISHED:
> >  	case TCP_CLOSE_WAIT:
> > +	case TCP_LAST_ACK:
> >  		kernel_sock_shutdown(sock, SHUT_RDWR);
> >  		trace_rpc_socket_shutdown(xprt, sock);
> >  		break;
>
> The only way to transition from TCP_CLOSE_WAIT to TCP_LAST_ACK is by
> closing the socket (see
> https://www.ibm.com/docs/en/zos/2.1.0?topic=SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.html),
> so closing it while in TCP_LAST_ACK is redundant.

You are right. The TCP_LAST_ACK state should not cause the socket to be
closed again. Instead, as in the case of TCP_FIN_WAIT1 and TCP_FIN_WAIT2,
the TCP_LAST_ACK case should not call xs_reset_transport. Will fix in v2.
(A rough sketch of that change follows after the quoted signature below.)

>
> _________________________________
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
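
For illustration only, here is a rough sketch of how the v2 change to the
switch in xs_tcp_shutdown() could look if TCP_LAST_ACK is grouped with the
FIN_WAIT states as discussed above. This is a sketch, not an actual v2
posting: the skst variable name, the TCP_FIN_WAIT1/TCP_FIN_WAIT2 arm, and
the default arm calling xs_reset_transport() are assumed from the quoted
hunk and from the commit named in the Fixes: tag.

	switch (skst) {
	case TCP_FIN_WAIT1:
	case TCP_FIN_WAIT2:
	case TCP_LAST_ACK:
		/* Leave the half-closed socket alone: no second shutdown
		 * and no xs_reset_transport(), so the bound local port
		 * survives and can be reused on the next connect.
		 */
		break;
	case TCP_ESTABLISHED:
	case TCP_CLOSE_WAIT:
		kernel_sock_shutdown(sock, SHUT_RDWR);
		trace_rpc_socket_shutdown(xprt, sock);
		break;
	default:
		xs_reset_transport(transport);
	}

With TCP_LAST_ACK handled like the FIN_WAIT states, the transport is not
torn down while the final FIN/ACK exchange is still in flight, which is
what lets xs_bind() rebind the same local port (936 in the dmesg excerpts
above) when the connection is re-established.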