Re: [PATCH] SUNRPC: backchannel RPC request must reference XPRT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 15 2019, Trond Myklebust wrote:

> Hi Neil,
>
> On Tue, 2019-10-15 at 10:36 +1100, NeilBrown wrote:
>> The backchannel RPC requests - that are queued waiting
>> for the reply to be sent by the "NFSv4 callback" thread -
>> have a pointer to the xprt, but it is not reference counted.
>> It is possible for the xprt to be freed while there are
>> still queued requests.
>> 
>> I think this has been a problem since
>> Commit fb7a0b9addbd ("nfs41: New backchannel helper routines")
>> when the code was introduced, but I suspect it became more of
>> a problem after
>> Commit 80b14d5e61ca ("SUNRPC: Add a structure to track multiple
>> transports")
>> (or there abouts).
>> Before this second patch, the nfs client would hold a reference to
>> the xprt to keep it alive.  After multipath was introduced,
>> a client holds a reference to a swtich, and the switch can have
>> multiple
>> xprts which can be added and removed.
>> 
>> I'm not sure of all the causal issues, but this patch has
>> fixed a customer problem were an NFSv4.1 client would run out
>> of memory with tens of thousands of backchannel rpc requests
>> queued for an xprt that had been freed.  This was a 64K-page
>> machine so each rpc_rqst consumed more than 128K of memory.
>> 
>> Fixes: 80b14d5e61ca ("SUNRPC: Add a structure to track multiple
>> transports")
>> cc: stable@xxxxxxxxxxxxxxx (v4.6)
>> Signed-off-by: NeilBrown <neilb@xxxxxxx>
>> ---
>>  net/sunrpc/backchannel_rqst.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/net/sunrpc/backchannel_rqst.c
>> b/net/sunrpc/backchannel_rqst.c
>> index 339e8c077c2d..c95ca39688b6 100644
>> --- a/net/sunrpc/backchannel_rqst.c
>> +++ b/net/sunrpc/backchannel_rqst.c
>> @@ -61,6 +61,7 @@ static void xprt_free_allocation(struct rpc_rqst
>> *req)
>>  	free_page((unsigned long)xbufp->head[0].iov_base);
>>  	xbufp = &req->rq_snd_buf;
>>  	free_page((unsigned long)xbufp->head[0].iov_base);
>> +	xprt_put(req->rq_xprt);
>>  	kfree(req);
>>  }
>
> Would it perhaps make better sense to move the xprt_get() to
> xprt_lookup_bc_request() and to release it in xprt_free_bc_rqst()? 

... maybe ...

>
> Otherwise as far as I can tell, we will have freed slots on the xprt-
>>bc_pa_list that hold a reference to the transport itself, meaning that
> the latter never gets released.

Apart from cleanup-on-error paths, xprt_free_allocation() is called:
 - in xprt_destroy_bc() - at most 'max_reqs' times.
 - in xprt_free_bc_rqst(), if the bc_alloc_count >= bc_alloc_max

So when xprt_destroy_bc() is called (from nfs4_destroy_session, via
xprt_destroy_backchannel()), bc_alloc_max() is reduced, and possibly
the requests are freed and bc_alloc_count is reduced accordingly.
If the requests were busy, they won't have been freed then, but will
then be freed when xprt_free_bc_reqst is called - because bc_alloc_max
has been reduced.

Once nfs4_destroy_session() has been called on all session, and
xprt_free_bc_rqst() has been called on all active requests, I think
they should be no requests left to be holding a reference to
the xprt.

And if there were requests left, and if we change the refcount code
(like you suggest) so that they weren't holding a reference, then they
would never be freed. - freeing an xprt doesn't clean out the bc_pa_list.

Though ... the connection from a session to an xprt isn't permanent (I
guess, based on the rcu_read_lock in nfs4_destroy_session... which
itself seems a bit odd as it doesn't inc a refcount while holding the
lock).

So maybe the counts can get out of balance.

Conclusion: I'm not sure the ref counts are entirely correct, but I
   cannot see a benefit from moving the xprt_get/put requests like you
   suggest.

Thanks,
NeilBrown


>
>>  
>> @@ -85,7 +86,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt
>> *xprt, gfp_t gfp_flags)
>>  	if (req == NULL)
>>  		return NULL;
>>  
>> -	req->rq_xprt = xprt;
>> +	req->rq_xprt = xprt_get(xprt);
>>  	INIT_LIST_HEAD(&req->rq_bc_list);
>>  
>>  	/* Preallocate one XDR receive buffer */
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux