Re: [PATCH v1 3/8] svcrdma: Add svc_rdma_get_context() API that is allowed to fail

> On Nov 24, 2015, at 1:55 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> 
>> +struct svc_rdma_op_ctxt *svc_rdma_get_context_gfp(struct svcxprt_rdma *xprt,
>> +						  gfp_t flags)
>> +{
>> +	struct svc_rdma_op_ctxt *ctxt;
>> +
>> +	ctxt = kmem_cache_alloc(svc_rdma_ctxt_cachep, flags);
>> +	if (!ctxt)
>> +		return NULL;
>> +	svc_rdma_init_context(xprt, ctxt);
>> +	return ctxt;
>> +}
>> +
>> +struct svc_rdma_op_ctxt *svc_rdma_get_context(struct svcxprt_rdma *xprt)
>> +{
>> +	struct svc_rdma_op_ctxt *ctxt;
>> +
>> +	ctxt = kmem_cache_alloc(svc_rdma_ctxt_cachep,
>> +				GFP_KERNEL | __GFP_NOFAIL);
>> +	svc_rdma_init_context(xprt, ctxt);
>> 	return ctxt;
> 
> Sounds like you should have just added a gfp_t argument to
> svc_rdma_get_context.  And if we have any way to avoid the __GFP_NOFAIL
> I'd really appreciate if we could give that a try.

Changed my mind on this.

struct svc_rdma_op_ctxt used to be smaller than a page, so these
allocations were unlikely to fail. But since the maximum NFS
READ and WRITE payload for NFS/RDMA was increased to 1MB,
struct svc_rdma_op_ctxt has grown to more than 6KB, so it is
no longer an order-0 memory allocation.
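
To make the order arithmetic concrete, here is a rough userspace
sketch of the idea behind the kernel's get_order(): the smallest n
such that n doublings of the page size cover the request. The
function name alloc_order() and the 4KB page size used below are
assumptions for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order n such that (page_size << n) >= size.
 * Userspace illustration only; alloc_order() is a hypothetical name.
 */
unsigned int alloc_order(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(size > 0 && page_size > 0);

	/* Double the span until it covers the requested size. */
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Assuming 4KB pages, anything up to 4096 bytes is order 0, while a
6KB object needs an order-1 (two contiguous pages) allocation,
which is harder to satisfy when memory is fragmented.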

Some ideas:

1. Pre-allocate these per connection in svc_rdma_accept().
There will never be more than sc_sq_depth of these. But that
could be a large number to allocate during connection
establishment.

2. Once allocated, cache them. If traffic doesn’t manage to
allocate sc_sq_depth of these over time, allocation can still
fail during a traffic burst in very low memory scenarios.

3. Use a mempool. This reserves a few of these which may never
be used. But allocation can still fail once the reserve is
consumed (same as 2).

4. Break out the sge and pages arrays into separate allocations
so the allocation requests are order 0.
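
As a rough illustration of idea 4, the context could hold pointers
to its arrays instead of embedding them. This is a userspace sketch
only: split_ctxt, sge_entry, and CTXT_MAX_ENTRIES are hypothetical
names, and the entry count of 128 is an assumption, not the
kernel's value. sge_entry only mimics the addr/length/lkey layout
of struct ib_sge.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CTXT_MAX_ENTRIES 128	/* hypothetical entry count */

/* Mimics the layout of struct ib_sge; redefined here for userspace. */
struct sge_entry {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Hypothetical slimmed-down context: arrays are separate allocations. */
struct split_ctxt {
	struct sge_entry *sge;
	void **pages;
	unsigned int count;
};

/* Release the arrays and the context; safe on a partly built context. */
void split_ctxt_free(struct split_ctxt *ctxt)
{
	if (!ctxt)
		return;
	free(ctxt->pages);
	free(ctxt->sge);
	free(ctxt);
}

/* Three small allocations instead of one large one. */
struct split_ctxt *split_ctxt_alloc(void)
{
	struct split_ctxt *ctxt = calloc(1, sizeof(*ctxt));

	if (!ctxt)
		return NULL;
	ctxt->sge = calloc(CTXT_MAX_ENTRIES, sizeof(*ctxt->sge));
	ctxt->pages = calloc(CTXT_MAX_ENTRIES, sizeof(*ctxt->pages));
	if (!ctxt->sge || !ctxt->pages) {
		split_ctxt_free(ctxt);
		return NULL;
	}
	return ctxt;
}
```

Whether each piece really fits in one page depends on the actual
entry count and element sizes; with a large enough count the sge
array alone could exceed 4KB. The split also turns one allocation
into three per context.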

Option 1 seems like the most robust solution, and it would be
fast, which matters because svc_rdma_get_context is a very
common operation.
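
A minimal userspace sketch of what pre-allocation could look like:
all contexts are allocated when the connection is set up, and the
hot path only pops and pushes a free list. The names ctxt_pool,
pool_ctxt, and the ctxt_* helpers are hypothetical, "depth" stands
in for sc_sq_depth, and the locking a real transport would need is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct svc_rdma_op_ctxt. */
struct pool_ctxt {
	struct pool_ctxt *next;	/* free-list link */
};

/* Hypothetical per-connection pool of pre-allocated contexts. */
struct ctxt_pool {
	struct pool_ctxt *free;	/* singly linked free list */
	unsigned int avail;	/* number of contexts on the list */
};

/* Free every context currently on the free list. */
void ctxt_pool_destroy(struct ctxt_pool *pool)
{
	struct pool_ctxt *ctxt;

	while ((ctxt = pool->free) != NULL) {
		pool->free = ctxt->next;
		free(ctxt);
	}
	pool->avail = 0;
}

/* Connection setup: the only place an allocation can fail. */
int ctxt_pool_init(struct ctxt_pool *pool, unsigned int depth)
{
	unsigned int i;

	pool->free = NULL;
	pool->avail = 0;
	for (i = 0; i < depth; i++) {
		struct pool_ctxt *ctxt = malloc(sizeof(*ctxt));

		if (!ctxt) {
			ctxt_pool_destroy(pool);	/* unwind partial setup */
			return -1;
		}
		ctxt->next = pool->free;
		pool->free = ctxt;
		pool->avail++;
	}
	return 0;
}

/* Hot path: pop a context; returns NULL only if the pool is empty. */
struct pool_ctxt *ctxt_get(struct ctxt_pool *pool)
{
	struct pool_ctxt *ctxt = pool->free;

	if (ctxt) {
		pool->free = ctxt->next;
		pool->avail--;
	}
	return ctxt;
}

/* Hot path: push a context back onto the free list. */
void ctxt_put(struct ctxt_pool *pool, struct pool_ctxt *ctxt)
{
	assert(ctxt != NULL);
	ctxt->next = pool->free;
	pool->free = ctxt;
	pool->avail++;
}
```

The cost is paid up front: connection setup does "depth"
allocations and can fail there, but get and put never allocate.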


--
Chuck Lever



