On 9 Dec 2023, at 4:55, Trond Myklebust wrote: > On Fri, 2023-12-08 at 14:19 -0500, Benjamin Coddington wrote: >> After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out >> when on >> the sending list"), any 4.1 backchannel tasks placed on the sending >> queue >> would immediately return with -ETIMEDOUT since their req timers are >> zero. >> We can fix this by keeping a copy of the rpc_clnt's timeout params on >> the >> transport and using them to properly setup the timeouts on the v4.1 >> backchannel tasks' req. >> >> Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on >> the sending list") >> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx> >> --- >> include/linux/sunrpc/xprt.h | 1 + >> net/sunrpc/clnt.c | 3 +++ >> net/sunrpc/xprt.c | 23 ++++++++++++++--------- >> 3 files changed, 18 insertions(+), 9 deletions(-) >> >> diff --git a/include/linux/sunrpc/xprt.h >> b/include/linux/sunrpc/xprt.h >> index f85d3a0daca2..7565902053f3 100644 >> --- a/include/linux/sunrpc/xprt.h >> +++ b/include/linux/sunrpc/xprt.h >> @@ -285,6 +285,7 @@ struct rpc_xprt { >> * items */ >> struct list_head bc_pa_list; /* List of >> preallocated >> * backchannel >> rpc_rqst's */ >> + struct rpc_timeout bc_timeout; /* backchannel >> timeout params */ >> #endif /* CONFIG_SUNRPC_BACKCHANNEL */ >> >> struct rb_root recv_queue; /* Receive queue */ >> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >> index d6805c1268a7..5891757c88b1 100644 >> --- a/net/sunrpc/clnt.c >> +++ b/net/sunrpc/clnt.c >> @@ -279,6 +279,9 @@ static struct rpc_xprt >> *rpc_clnt_set_transport(struct rpc_clnt *clnt, >> clnt->cl_autobind = 1; >> >> clnt->cl_timeout = timeout; >> +#if defined(CONFIG_SUNRPC_BACKCHANNEL) >> + memcpy(&xprt->bc_timeout, timeout, sizeof(struct >> rpc_timeout)); >> +#endif > > Hmm... The xprt can and will be shared among a number of rpc_clnt > instances. I therefore think we're better off doing this when we're > setting up the back channel. .. and it seems the timeouts could be different for each, so now I think keeping a copy of the last rpc_clnt's timeouts on the xprt is wrong. We could use the current timeouts from the nfs_client side after we figure out which nfs_client that would be in nfs4_callback_compound(). Trouble is if we have to make a reply before getting that far there's still no timeout set, but that's probably a rare case and could use the xprt type defaults. > i.e. probably doing it in nfs41_init_clientid() after we picked up the > lease time (but before we mark the client as ready), and then doing it > in nfs4_proc_bind_conn_to_session() if ever that gets called. > > Note that we have to set the bc_timeout on all xprts that could act as > back channels, so you might want to use > rpc_clnt_iterate_for_each_xprt(). > It might also be worth to look at Olga's trunking code, since I suspect > we might need to do something there when adding a new xprt to the > existing set. Ben