Re: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket

> On Mar 8, 2023, at 14:01, Siddharth Kawar <Siddharth.Kawar@xxxxxxxxxxxxx> wrote:
> 
> Subject: [PATCH] SUNRPC: fix shutdown of NFS TCP client socket
> 
> NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
> reconnecting using the same local TCP port. Unique NFS operations are
> identified by the per-TCP connection set of XIDs. This prevents file
> corruption when non-idempotent NFS operations are retried.
> 
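[For illustration only; the struct and helper below are a hypothetical sketch of the idea, not the actual nfsd DRC code. A server-side replay cache entry is effectively keyed on the client's address, the client's source port, and the XID, so a retransmission arriving from a different local port no longer matches the cached reply and a non-idempotent operation can be executed a second time.]

/* Hypothetical sketch of a DRC-style lookup key; not the actual nfsd code. */
#include <stdint.h>
#include <netinet/in.h>

struct drc_key {
	struct in_addr	client_addr;	/* client IP address                */
	uint16_t	client_port;	/* client's local (source) TCP port */
	uint32_t	xid;		/* RPC transaction ID               */
};

/* A retried request only matches a cached reply if all three fields match;
 * reconnecting from a different local port therefore misses the cache and
 * the non-idempotent operation is executed again. */
static int drc_key_match(const struct drc_key *a, const struct drc_key *b)
{
	return a->client_addr.s_addr == b->client_addr.s_addr &&
	       a->client_port == b->client_port &&
	       a->xid == b->xid;
}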
> Currently, NFS client TCP connections are using different local TCP ports
> when reconnecting to NFS servers.
> 
> After an NFS server initiates shutdown of the TCP connection, the NFS
> client's TCP socket is set to NULL after the socket state has reached
> TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
> the same local port but fails with EADDRNOTAVAIL (99). This forces the
> socket to use a different local TCP port to reconnect to the remote NFS
> server.
> 
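[A rough userspace analogue of that failure mode, not the kernel's xs_bind()/kernel_connect() path: with an earlier connection to the same server still lingering on the identical 4-tuple (e.g. in TCP_LAST_ACK), a new socket can usually be bound to the old local port when the reuse options are set on both sockets, but connect() to the same remote address/port fails with EADDRNOTAVAIL (99) until the old connection reaches TCP_CLOSE. The server address and helper name are illustrative.]

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

static int reconnect_from_port(const char *server_ip, unsigned short local_port)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int one = 1;
	struct sockaddr_in local = {
		.sin_family = AF_INET,
		.sin_port   = htons(local_port),
		.sin_addr   = { .s_addr = htonl(INADDR_ANY) },
	};
	struct sockaddr_in remote = {
		.sin_family = AF_INET,
		.sin_port   = htons(2049),	/* NFS server port */
	};

	if (fd < 0)
		return -1;
	inet_pton(AF_INET, server_ip, &remote.sin_addr);

	/* Both the old and the new socket need the reuse options for the
	 * bind() below to succeed while the old local port is still in use. */
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
		perror("bind");		/* normally succeeds here */

	/* Expected to fail with EADDRNOTAVAIL (99) while the previous
	 * connection on the identical 4-tuple has not yet reached TCP_CLOSE,
	 * mirroring the "connect status 99" in the dmesg excerpt below. */
	if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0)
		fprintf(stderr, "connect: %s (%d)\n", strerror(errno), errno);

	return fd;
}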
> State Transition and Events:
> TCP_CLOSING(8)

No: TCP state == 8 is TCP_CLOSE_WAIT

> TCP_LAST_ACK(9)
> connect(fail EADDRNOTAVAIL(99))
> TCP_CLOSE(7)
> bind on new port
> connect success
> 
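[For anyone decoding the numeric states above and in the traces below, this is the relevant part of the kernel's TCP state enum, abridged from include/net/tcp_states.h; note that 8 is TCP_CLOSE_WAIT and TCP_CLOSING is 11.]

enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,	/*  2 */
	TCP_SYN_RECV,	/*  3 */
	TCP_FIN_WAIT1,	/*  4 */
	TCP_FIN_WAIT2,	/*  5 */
	TCP_TIME_WAIT,	/*  6 */
	TCP_CLOSE,	/*  7 */
	TCP_CLOSE_WAIT,	/*  8 */
	TCP_LAST_ACK,	/*  9 */
	TCP_LISTEN,	/* 10 */
	TCP_CLOSING,	/* 11 */
};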
> dmesg excerpts showing reconnect switching from local port of 688 to 1014:
> [590701.200229] NFS: mkdir(0:61/560857152), testQ
> [590701.200231] NFS call  mkdir testQ
> [590701.200259] RPC:       xs_tcp_send_request(224) = 0
> [590751.883111] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883146] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [590751.883160] RPC:       xs_data_ready...
> [590751.883232] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883235] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883238] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883239] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883283] RPC:       xs_connect scheduled xprt 0000000051f4e000
> [590751.883314] RPC:       xs_bind 0.0.0.0:688: ok (0)
> [590751.883321] RPC:       worker connecting xprt 0000000051f4e000 via tcp
>                           to 10.101.1.30 (port 2049)
> [590751.883330] RPC:       0000000051f4e000 connect status 99 connected 0
>                           sock state 7
> [590751.883342] RPC:       xs_tcp_state_change client 0000000051f4e000...
> [590751.883343] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [590751.883356] RPC:       xs_connect scheduled xprt 0000000051f4e000
> [590751.883383] RPC:       xs_bind 0.0.0.0:1014: ok (0)
> [590751.883388] RPC:       worker connecting xprt 0000000051f4e000 via tcp
>                           to 10.101.1.30 (port 2049)
> [590751.883420] RPC:       0000000051f4e000 connect status 115 connected 0
>                           sock state 2
> ...
> 
> 
> State Transition and Events with patch applied:
> TCP_CLOSING(8)
> TCP_LAST_ACK(9)
> TCP_CLOSE(7)
> connect(reuse of port succeeds)
> 
> dmesg excerpts showing reconnect on same port (936):
> [  257.139935] NFS: mkdir(0:59/560857152), testQ
> [  257.139937] NFS call  mkdir testQ
> ...
> [  307.822702] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
> [  307.822714] RPC:       xs_data_ready...
> [  307.822817] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.822821] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.822825] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.822826] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823606] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.823609] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823629] RPC:       xs_tcp_state_change client 00000000ce702f14...
> [  307.823632] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  307.823676] RPC:       xs_connect scheduled xprt 00000000ce702f14
> [  307.823704] RPC:       xs_bind 0.0.0.0:936: ok (0)
> [  307.823709] RPC:       worker connecting xprt 00000000ce702f14 via tcp
>                          to 10.101.1.30 (port 2049)
> [  307.823748] RPC:       00000000ce702f14 connect status 115 connected 0
>                          sock state 2
> ...
> [  314.916193] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [  314.916251] RPC:       xs_connect scheduled xprt 00000000ce702f14
> [  314.916282] RPC:       xs_bind 0.0.0.0:936: ok (0)
> [  314.916292] RPC:       worker connecting xprt 00000000ce702f14 via tcp
>                          to 10.101.1.30 (port 2049)
> [  314.916342] RPC:       00000000ce702f14 connect status 115 connected 0
>                          sock state 2
> 
> Fixes: 7c81e6a9d75b ("SUNRPC: Tweak TCP socket shutdown in the RPC client")
> Signed-off-by: Siddharth Rajendra Kawar <sikawar@xxxxxxxxxxxxx>
> ---
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index d8ee06a9650a..e55e380bc10c 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2094,6 +2094,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
>                break;
>        case TCP_ESTABLISHED:
>        case TCP_CLOSE_WAIT:
> +       case TCP_LAST_ACK:
>                kernel_sock_shutdown(sock, SHUT_RDWR);
>                trace_rpc_socket_shutdown(xprt, sock);
>                break;

The only way to transition from TCP_CLOSE_WAIT to TCP_LAST_ACK is by closing the socket (see https://www.ibm.com/docs/en/zos/2.1.0?topic=SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.html), so closing it again while it is in TCP_LAST_ACK is redundant.
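To spell that out with a minimal sketch (assuming the peer has already sent its FIN, so the socket is sitting in TCP_CLOSE_WAIT): the socket only moves to TCP_LAST_ACK once the local side sends its own FIN via shutdown()/close(), so by the time sk_state reads TCP_LAST_ACK a local shutdown has already happened.

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

static int tcp_state(int fd)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
		return -1;
	return info.tcpi_state;
}

/* Assumes fd is a connected socket whose peer has already closed,
 * i.e. the socket is currently in TCP_CLOSE_WAIT (state 8). */
static void show_transition(int fd)
{
	printf("before local shutdown: state %d\n", tcp_state(fd)); /* 8, TCP_CLOSE_WAIT */
	shutdown(fd, SHUT_WR);			/* send our FIN */
	printf("after local shutdown:  state %d\n", tcp_state(fd)); /* 9, TCP_LAST_ACK */
}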


_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx




