> From: Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx> > Sent: Thursday, January 26, 2023 10:32 AM > > > From: Andrew Klaassen <andrew.klaassen@xxxxxxxxxxxxxx> > > Sent: Monday, January 23, 2023 11:31 AM > > > > Hello, > > > > There's a specific NFSv4 mount on a specific machine which we'd like > > to timeout and return an error after a few seconds if the server goes away. > > > > I've confirmed the following on two different kernels, 4.18.0- > > 348.12.2.el8_5.x86_64 and 6.1.7-200.fc37.x86_64. > > > > I've been able to get both autofs and the mount command to cooperate, > > so that the mount attempt fails after an arbitrary number of seconds. > > This mount command, for example, will fail after 6 seconds, as > > expected based on the timeo=20,retrans=2,retry=0 options: > > > > $ time sudo mount -t nfs4 -o > > rw,relatime,sync,vers=4.2,rsize=131072,wsize=131072,namlen=255,acregmi > > n > > > =0,acregmax=0,acdirmin=0,acdirmax=0,soft,noac,proto=tcp,timeo=20,retra > > n s=2,retry=0,sec=sys thor04:/mnt/thorfs04 /mnt/thor04 > > mount.nfs4: Connection timed out > > > > real 0m6.084s > > user 0m0.007s > > sys 0m0.015s > > > > However, if the share is already mounted and the server goes away, the > > timeout is always 2 minutes plus the time I expect based on timeo and > > retrans. In this case, 2 minutes and 6 seconds: > > > > $ time ls /mnt/thor04 > > ls: cannot access '/mnt/thor04': Connection timed out > > > > real 2m6.025s > > user 0m0.003s > > sys 0m0.000s > > > > Watching the outgoing packets in the second case, the pattern is > > always the > > same: > > - 0.2 seconds between the first two, then doubling each time until > > the two minute mark is exceeded (so the last NFS packet, which is > > always the 11th packet, is sent around 1:45 after the first). > > - Then some generic packets that start exactly-ish on the two minute > > mark, 1 second between the first two, then doubling each time. (By > > this time the NFS command has given up.) > > > > 11:10:21.898305 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834889483 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:22.105189 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834889690 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:22.313290 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834889898 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:22.721269 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834890306 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:23.569192 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834891154 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:25.233212 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834892818 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:28.497282 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834896082 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:35.025219 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834902610 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:10:48.337201 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834915922 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:11:14.449303 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834942034 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:12:08.721251 IP 10.30.13.2.916 > 10.31.3.13.2049: Flags [P.], seq > > 14452:14652, ack 18561, win 501, options [nop,nop,TS val 834996306 ecr > > 1589769203], length 200: NFS request xid 3614904256 196 getattr fh > > 0,2/53 > > 11:12:22.545394 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835010130 ecr > > 0,nop,wscale 7], length 0 > > 11:12:23.570199 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835011155 ecr > > 0,nop,wscale 7], length 0 > > 11:12:25.617284 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835013202 ecr > > 0,nop,wscale 7], length 0 > > 11:12:29.649219 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835017234 ecr > > 0,nop,wscale 7], length 0 > > 11:12:37.905274 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835025490 ecr > > 0,nop,wscale 7], length 0 > > 11:12:54.289212 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835041874 ecr > > 0,nop,wscale 7], length 0 > > 11:13:26.545304 IP 10.30.13.2.942 > 10.31.3.13.2049: Flags [S], seq > > 1375256951, win 64240, options [mss 1460,sackOK,TS val 835074130 ecr > > 0,nop,wscale 7], length 0 > > > > I tried changing tcp_retries2 as suggested in another thread from this list: > > > > # echo 3 > /proc/sys/net/ipv4/tcp_retries2 > > > > ...but it made no difference on either kernel. The 2 minute timeout > > also doesn't seem to match with what I'd calculate from the initial > > value of tcp_retries2, which should give a much higher timeout. > > > > The only clue I've been able to find is in the retry=n entry in the > > NFS > > manpage: > > > > " For TCP the default is 3 minutes, but system TCP connection timeouts > > will sometimes limit the timeout of each retransmission to around 2 > minutes." > > > > What I'm not able to make sense of: > > - The retry option says that it applies to mount operations, not > > read/write operations. However, in this case I'm seeing the 2 minute > > delay on read/write operations but *not* mount operations. > > - A couple of hours of searching didn't lead me to any kernel > > settings that would result in a 2 minute timeout. > > > > Does anyone have any clues about a) what's happening and b) how to get > > our desired behaviour of being able to control both mount and > > read/write timeouts down to a few seconds? > > > > Thanks. > > I thought that changing TCP_RTO_MAX in include/net/tcp.h from 120 to > something smaller and recompiling the kernel would change the 2 minute > timeout, but it had no effect. I'm going to keep poking through the kernel > code to see if there's a knob I can turn to change the 2 minute timeout, so > that I can at least understand where it's coming from. > > Any hints as to where I should be looking? I believe I've made some progress with this today: - Calls to rpc_create() from fs/nfs/client.c are sending an rpc_timeout struct with their args. - rpc_create() does *not* pass the timeout on to xprt_create_transport(), which then can't pass it on to xs_setup_tcp(). - xs_setup_tcp(), having no timeout passed to it, uses xs_tcp_default_timeout instead. - changing xs_tcp_default_timeout changes the "ls" timeout behaviour I described above. In theory all of this means that the timeout simply needs to be passed through and used instead of xs_tcp_default_timeout. I'm going to give this a try tomorrow. Here's what I'm going to try first; I'm no C programmer, though, so any advice or corrections you might have would be appreciated. Thanks. Andrew diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 0b0b9f1eed46..1350c1f489f7 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -532,6 +532,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) .addrlen = args->addrsize, .servername = args->servername, .bc_xprt = args->bc_xprt, + .timeout = args->timeout, }; char servername[48]; struct rpc_clnt *clnt; diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index aaa5b2741b79..adc79d94b59e 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -3003,7 +3003,7 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args) xprt->idle_timeout = XS_IDLE_DISC_TO; xprt->ops = &xs_tcp_ops; - xprt->timeout = &xs_tcp_default_timeout; + xprt->timeout = args->timeout; xprt->max_reconnect_timeout = xprt->timeout->to_maxval; xprt->connect_timeout = xprt->timeout->to_initval *