Re: Question: How to customize retransmission timeout of unacknowledged NFS v3 TCP packet?

This problem is a duplicate of
https://lore.kernel.org/linux-nfs/YQBPR01MB10724B629B69F7969AC6BDF9586C89@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

According to that discussion, a patch has been submitted to fix the
timeout used in xprt_socket, but it has not landed yet. In the
meantime, "tcp_retries2" does not control the retransmission timeout of
an unacknowledged packet. Is there any workaround to change the
retransmission timeout?
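
For reference, here is a rough sketch of the time budget that
"tcp_retries2" would be expected to allow if it applied to this
connection. It assumes the Linux defaults of a 200 ms minimum and a
120 s maximum RTO, and follows the "hypothetical timeout" arithmetic
from the kernel sysctl documentation. The function and constant names
are only illustrative, and the real RTO depends on the measured RTT.

```python
import math

TCP_RTO_MIN = 0.2    # seconds; assumed Linux default minimum RTO
TCP_RTO_MAX = 120.0  # seconds; assumed Linux default maximum RTO


def retries2_budget(retries: int,
                    rto_base: float = TCP_RTO_MIN,
                    rto_max: float = TCP_RTO_MAX) -> float:
    """Approximate seconds before TCP gives up on an unacked segment."""
    # Number of doublings before the RTO is clamped at rto_max.
    linear_thresh = int(math.log2(rto_max / rto_base))
    if retries <= linear_thresh:
        # Pure exponential backoff: rto_base * (1 + 2 + 4 + ...).
        return ((2 << retries) - 1) * rto_base
    # Exponential part, then a fixed rto_max wait per remaining retry.
    exponential = ((2 << linear_thresh) - 1) * rto_base
    return exponential + (retries - linear_thresh) * rto_max


if __name__ == "__main__":
    for n in (5, 15):
        print(f"tcp_retries2={n}: ~{retries2_budget(n):.1f} s")
```

With tcp_retries2=5 this gives about 12.6 s, far less than the roughly
110 s of retransmissions in the trace below, which is consistent with
the setting not being applied to this connection.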

Best regards,
Zhitao Li

On Wed, May 29, 2024 at 6:18 PM Zhitao Li <zhitao.li@xxxxxxxxxx> wrote:
>
> Essentially, we need a mechanism to quickly reconnect to new
> nfs-server nodes for failover.
> I also tried adjusting mount options, setting "timeo" to 10s and
> "retrans" to 1, and found that they don't work either. It seems that
> the NFS v3 client always tries to reconnect only after a request has
> hung for 3 minutes, no matter what "timeo" and "retrans" are.
>
> On Wed, May 29, 2024 at 6:10 PM Zhitao Li <zhitao.li@xxxxxxxxxx> wrote:
> >
> > Hi, dear community,
> >
> > In our NFS environment, the NFS client mounts a remote NFS export
> > through its VIP. The VIP can be assigned to another server node for
> > failover. However, the NFS client does not retransmit the
> > unacknowledged packet until 50+ seconds after the VIP is ready on
> > the new node, because of the exponential backoff retransmission
> > algorithm. I tried to set the parameter "tcp_retries2" to a smaller
> > value so that the NFS client can reconnect to the new node more
> > quickly, but this parameter didn't take effect. From the tcpdump
> > entries below:
> >   1. At "2024-05-29 11:47:00", ARP is updated.
> >   2. At "2024-05-29 11:47:52", the NFS client retransmits the packet.
> >   3. Then the connection is reset and a new connection starts.
> >
> > I guess the parameter only takes effect for user-space applications
> > and not for kernel modules like the NFS client. Could anyone advise
> > how to customize the retransmission timeout of an unacknowledged
> > NFS v3 TCP packet?
> >
> >
> > OS: Linux kernel v6.7.0
> > NFS mount options:
> > vers=3,nolock,proto=tcp,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport
> >
> > tcp_retries2:
> > [root@vm-play zhitaoli]# sysctl -w net.ipv4.tcp_retries2=5
> > net.ipv4.tcp_retries2 = 5
> > [root@vm-play zhitaoli]# cat /proc/sys/net/ipv4/tcp_retries2
> > 5
> >
> > tcpdump entries:
> >
> > 2024-05-29 11:46:02.331891 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973659245 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:02.542836 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973659456 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:02.751013 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973659664 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:03.166958 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973660080 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:04.046882 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973660960 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:05.710910 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973662624 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:09.039310 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973665952 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:16.017889 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973672930 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:29.326891 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973686240 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:46:55.950915 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973712864 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:47:00.379844 52:54:00:13:1f:34 > Broadcast, ethertype
> > ARP (0x0806), length 60: Reply 10.125.1.85 is-at 52:54:00:13:1f:34,
> > length 46
> >
> > 2024-05-29 11:47:52.271192 52:54:00:1d:a4:24 > 52:54:00:13:1f:34,
> > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > options [nop,nop,TS val 1973769184 ecr 28456
> > 58566], length 124: NFS request xid 1954624602 120 access fh
> > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> >
> > 2024-05-29 11:47:52.272041 52:54:00:13:1f:34 > 52:54:00:1d:a4:24,
> > ethertype IPv4 (0x0800), length 54: 10.125.1.85.nfs >
> > 10.125.1.214.58428: Flags [R], seq 1148562527, win 0, length 0
> >
> > 2024-05-29 11:47:52.272909 52:54:00:1d:a4:24 > 52:54:00:13:1f:34,
> > ethertype IPv4 (0x0800), length 74: 10.125.1.214.58428 >
> > 10.125.1.85.nfs: Flags [S], seq 1734997801, win 32120, options [mss
> > 1460,sackOK,TS val 1973769186 ecr 0,nop,wscale 7], length 0
> >
> > 2024-05-29 11:47:52.273503 52:54:00:13:1f:34 > 52:54:00:1d:a4:24,
> > ethertype IPv4 (0x0800), length 74: 10.125.1.85.nfs >
> > 10.125.1.214.58428: Flags [S.], seq 1078843840, ack 1734997802, win
> > 28960, options [mss 1460,sackOK,TS val 2235915769 ecr
> > 1973769186,nop,wscale 7], length 0
> >
> >
> > Best regards,
> > Zhitao Li
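
The backoff in the trace above can be checked with a short script.
This is only a sketch: the timestamps are copied from the tcpdump
entries for the retransmitted segment (seq 129897:130021) and the ARP
reply, and the helper names are made up for illustration.

```python
from datetime import datetime

# Send times of the same segment, copied from the tcpdump entries.
SEND_TIMES = [
    "2024-05-29 11:46:02.331891",
    "2024-05-29 11:46:02.542836",
    "2024-05-29 11:46:02.751013",
    "2024-05-29 11:46:03.166958",
    "2024-05-29 11:46:04.046882",
    "2024-05-29 11:46:05.710910",
    "2024-05-29 11:46:09.039310",
    "2024-05-29 11:46:16.017889",
    "2024-05-29 11:46:29.326891",
    "2024-05-29 11:46:55.950915",
    "2024-05-29 11:47:52.271192",
]
ARP_UPDATE = "2024-05-29 11:47:00.379844"


def parse(stamp: str) -> datetime:
    """Parse a tcpdump-style timestamp with microseconds."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def gaps(stamps: list[str]) -> list[float]:
    """Seconds between consecutive transmissions."""
    times = [parse(s) for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


intervals = gaps(SEND_TIMES)
wait_after_arp = (parse(SEND_TIMES[-1]) - parse(ARP_UPDATE)).total_seconds()

if __name__ == "__main__":
    for i, gap in enumerate(intervals, start=1):
        print(f"retransmission {i}: {gap:.3f} s after previous send")
    print(f"first send after ARP update: {wait_after_arp:.3f} s later")
```

The intervals roughly double each time. The last one is about 56 s, so
the first retransmission that can reach the new node arrives about 52 s
after the ARP reply.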




