On 4 Jan 2024, at 6:55, Benjamin Coddington wrote: > On 3 Jan 2024, at 18:00, Trond Myklebust wrote: > >> On Wed, 2024-01-03 at 16:45 -0500, Anna Schumaker wrote: >>> Hi Ben, >>> >>> On Wed, Dec 13, 2023 at 2:49 PM Benjamin Coddington >>> <bcodding@xxxxxxxxxx> wrote: >>>> >>>> For backchannel requests that lookup the appropriate nfs_client, >>>> use the >>>> state-management rpc_clnt's rpc_timeout parameters for the >>>> backchannel's >>>> response. When the nfs_client cannot be found, fall back to using >>>> the >>>> xprt's default timeout parameters. >>> >>> I'm seeing a use-after-free after applying this patch when using pNFS >>> and session trunking. Any idea what's going on? Here is the stack >>> trace I'm seeing: >> >> I'm going to guess that this is happening because nothing is clearing >> rqstp->bc_rpc_clnt after a call to svc_process_bc(). So if something >> later calls CB_NULL, then the resulting svc_process_bc() will free an >> extra reference. > > Doh! > >>> >>> I hit this while testing against ontap, if that helps. >>> >>> Thanks, >>> Anna > > Thank you for the test!! > >>>> --- a/fs/nfs/callback_xdr.c >>>> +++ b/fs/nfs/callback_xdr.c >>>> @@ -967,6 +967,9 @@ static __be32 nfs4_callback_compound(struct >>>> svc_rqst *rqstp) >>>> nops--; >>>> } >>>> >>>> + if (svc_is_backchannel(rqstp) && cps.clp) >>>> + rqstp->bc_rpc_clnt = cps.clp->cl_rpcclient; >> >> >> You can re-create the clnt->cl_timeout in svc_process_bc() by just >> storing the values of to_initval and to_retrans here. Why store a >> reference to an entire rpc client structure that you don't need? > > Hmm, I think I started by thinking we could simply set tk_client, but then > didn't end up with that for other reasons and just kept passing the single > pointer. .. and the client being NULL was the signal to fallback to using the xprt timeouts. That's lost with to_initval and to_retries because they can be set to zero deliberately. I'm not immediately seeing an easy way to carry all that info across on the svc_rqst without three additional fields or another struct. Either we drop the part where we fallback to the xprt defaults, or things get a bit uglier. Ben