Re: Question: How to customize retransmission timeout of unacknowledged NFS v3 TCP packet?

Could anyone help with this issue? I've spent several days on it, and
neither "tcp_retries2" nor mount options such as "timeo" and "retrans"
make the client give up retransmission any earlier.

Regards,
Zhitao Li.

On Fri, May 31, 2024 at 3:28 PM Zhitao Li <zhitao.li@xxxxxxxxxx> wrote:
>
> This problem is a duplicate of
> https://lore.kernel.org/linux-nfs/YQBPR01MB10724B629B69F7969AC6BDF9586C89@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> According to that discussion, a patch has been submitted to fix the
> timeout used in xprt_socket, but it is still pending. In the meantime,
> "tcp_retries2" does not control the retransmission timeout of an
> unacknowledged packet. Is there any workaround to change the
> retransmission timeout?
>
> Best regards,
> Zhitao Li
>
> On Wed, May 29, 2024 at 6:18 PM Zhitao Li <zhitao.li@xxxxxxxxxx> wrote:
> >
> > Essentially, we need a mechanism to reconnect quickly to a new
> > nfs-server node on failover.
> > I also tried setting the mount options "timeo" to 10s and "retrans"
> > to 1, and found that they don't help either. It seems that the NFS
> > v3 client only tries to reconnect after a request has hung for 3
> > minutes, no matter what "timeo" and "retrans" are set to.
> >
> > On Wed, May 29, 2024 at 6:10 PM Zhitao Li <zhitao.li@xxxxxxxxxx> wrote:
> > >
> > > Hi, dear community,
> > >
> > > In our NFS environment, the NFS client mounts a remote NFS export
> > > through a VIP. The VIP can be reassigned to another server node for
> > > failover. However, the NFS client does not resend the unacknowledged
> > > packet until 50s+ after the VIP is ready on the new node, because of
> > > the exponential backoff retransmission algorithm. I tried lowering
> > > "tcp_retries2" so that the NFS client would reconnect to the new node
> > > more quickly, but the parameter did not take effect. From the
> > > tcpdump entries below:
> > >   1. At "2024-05-29 11:47:00", ARP is updated.
> > >   2. At "2024-05-29 11:47:52", the NFS client retransmits the packet.
> > >   3. The connection is then reset and a new connection starts.
> > >
> > > I guess the parameter only takes effect for applications and not
> > > for kernel modules such as the NFS client. Could anyone give some
> > > advice on how to customize the retransmission timeout of an
> > > unacknowledged NFS v3 TCP packet?
> > >
> > >
> > > OS: Linux kernel v6.7.0
> > > NFS mount options:
> > > vers=3,nolock,proto=tcp,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport
> > >
> > > tcp_retries2:
> > > [root@vm-play zhitaoli]# sysctl -w net.ipv4.tcp_retries2=5
> > > net.ipv4.tcp_retries2 = 5
> > > [root@vm-play zhitaoli]# cat /proc/sys/net/ipv4/tcp_retries2
> > > 5
> > >
> > > tcpdump entries:
> > >
> > > 2024-05-29 11:46:02.331891 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973659245 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:02.542836 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973659456 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:02.751013 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973659664 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:03.166958 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973660080 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:04.046882 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973660960 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:05.710910 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973662624 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:09.039310 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973665952 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:16.017889 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973672930 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:29.326891 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973686240 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:46:55.950915 52:54:00:1d:a4:24 > 52:54:00:a0:93:93,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973712864 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:47:00.379844 52:54:00:13:1f:34 > Broadcast, ethertype
> > > ARP (0x0806), length 60: Reply 10.125.1.85 is-at 52:54:00:13:1f:34,
> > > length 46
> > >
> > > 2024-05-29 11:47:52.271192 52:54:00:1d:a4:24 > 52:54:00:13:1f:34,
> > > ethertype IPv4 (0x0800), length 190: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [P.], seq 129897:130021, ack 171633, win 2356,
> > > options [nop,nop,TS val 1973769184 ecr 2845658566], length 124: NFS
> > > request xid 1954624602 120 access fh
> > > Unknown/43000001180100000000000000DE40020000000000F439000000000000000000
> > > NFS_ACCESS_READ|NFS_ACCESS_LOOKUP|NFS_ACCESS_MODIFY|NFS_ACCESS_EXTEND|NFS_ACCESS_DELETE
> > >
> > > 2024-05-29 11:47:52.272041 52:54:00:13:1f:34 > 52:54:00:1d:a4:24,
> > > ethertype IPv4 (0x0800), length 54: 10.125.1.85.nfs >
> > > 10.125.1.214.58428: Flags [R], seq 1148562527, win 0, length 0
> > >
> > > 2024-05-29 11:47:52.272909 52:54:00:1d:a4:24 > 52:54:00:13:1f:34,
> > > ethertype IPv4 (0x0800), length 74: 10.125.1.214.58428 >
> > > 10.125.1.85.nfs: Flags [S], seq 1734997801, win 32120, options [mss
> > > 1460,sackOK,TS val 1973769186 ecr 0,nop,wscale 7], length 0
> > >
> > > 2024-05-29 11:47:52.273503 52:54:00:13:1f:34 > 52:54:00:1d:a4:24,
> > > ethertype IPv4 (0x0800), length 74: 10.125.1.85.nfs >
> > > 10.125.1.214.58428: Flags [S.], seq 1078843840, ack 1734997802, win
> > > 28960, options [mss 1460,sackOK,TS val 2235915769 ecr
> > > 1973769186,nop,wscale 7], length 0
> > >
> > >
> > > Best regards,
> > > Zhitao Li
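
For reference, the backoff visible in the tcpdump entries above can be
checked with a short script. This is only a sketch: the timestamps are
copied by hand from the capture, and the names SEND_TIMES, ARP_UPDATE,
gaps and wait_after_arp are made up for illustration. It prints the
interval between successive transmissions of the same segment
(seq 129897:130021) and the delay between the ARP update and the next
retransmission.

```python
from datetime import datetime

# Send times of the same segment (seq 129897:130021), copied by hand
# from the tcpdump entries quoted in this thread (all on 2024-05-29).
SEND_TIMES = [
    "11:46:02.331891",
    "11:46:02.542836",
    "11:46:02.751013",
    "11:46:03.166958",
    "11:46:04.046882",
    "11:46:05.710910",
    "11:46:09.039310",
    "11:46:16.017889",
    "11:46:29.326891",
    "11:46:55.950915",
    "11:47:52.271192",
]

# Time of the gratuitous ARP reply announcing the VIP on the new node.
ARP_UPDATE = "11:47:00.379844"


def parse(ts):
    """Parse an HH:MM:SS.ffffff timestamp (same day assumed)."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def gaps(times):
    """Seconds between each transmission and the previous one."""
    parsed = [parse(t) for t in times]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


def wait_after_arp(times, arp):
    """Seconds from the ARP update to the first transmission after it."""
    arp_t = parse(arp)
    for t in times:
        sent = parse(t)
        if sent > arp_t:
            return (sent - arp_t).total_seconds()
    return None


if __name__ == "__main__":
    for n, g in enumerate(gaps(SEND_TIMES), start=1):
        print(f"interval {n:2d}: {g:8.3f} s")
    delay = wait_after_arp(SEND_TIMES, ARP_UPDATE)
    print(f"ARP update to next retransmission: {delay:.3f} s")
```

From the third interval on, each wait is roughly twice the previous
one, and the retransmission that finally reaches the new node comes
about 51.9 s after the ARP reply, which matches the 50s+ delay
described above.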




