Re: [[RFC] 1/1] SUNRPC: dynamic rpc_slot allocator for TCP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 03 May 2011 20:44:50 -0400 Trond Myklebust
<Trond.Myklebust@xxxxxxxxxx> wrote:

> On Tue, 2011-05-03 at 20:20 -0400, Jeff Layton wrote:
> > On Mon,  2 May 2011 21:40:08 -0400
> > andros@xxxxxxxxxx wrote:
> > 
> > > From: Andy Adamson <andros@xxxxxxxxxx>
> > > 
> > > Hookup TCP congestion feedback into rpc_slot allocation so that the RPC layer
> > > can fully utilize the negotiated TCP window.
> > > 
> > > Use a slab cache for rpc_slots. Statically allocate an rpc_xprt rpc_slot slab
> > > cache using GFP_KERNEL to the RPC_DEF_SLOT_TABLE number of slots at
> > > rpc_xprt allocation.
> > > 
> > > Add a dynamic rpc slot allocator to rpc_xprt_ops which is set only for TCP.
> > > For TCP, trigger a dyamic slot allocation in response to a write_space
> > > callback which is in turn called when the TCP layer is waiting for buffer space.
> > > 
> > > Dynamically add a slot at the beginning of the RPC call_transmit state. The slot
> > > allocator uses GFP_NOWAIT and will return without allocating a slot if
> > > GFP_NOWAIT allocation fails. This is OK because the write_space callback will
> > > be called again, and the dynamic slot allocator can retry.
> > > 
> > > Signed-off-by: Andy Adamson <andros@xxxxxxxxx>
> > > ---
> > >  include/linux/sunrpc/sched.h |    2 +
> > >  include/linux/sunrpc/xprt.h  |    6 +++-
> > >  net/sunrpc/clnt.c            |    4 ++
> > >  net/sunrpc/sched.c           |   39 ++++++++++++++++++++++
> > >  net/sunrpc/xprt.c            |   75 +++++++++++++++++++++++++++++++++++++-----
> > >  net/sunrpc/xprtsock.c        |    1 +
> > >  6 files changed, 117 insertions(+), 10 deletions(-)
> > > 
> > 
> > Nice work, comments inline below...
> > 
> > 
> > [...]
> > 
> > > +
> > > +/*
> > > + * Static transport rpc_slot allocation called only at rpc_xprt allocation.
> > > + * No need to take the xprt->reserve_lock.
> > > + */
> > > +int
> > > +xprt_alloc_slot_entries(struct rpc_xprt *xprt, int num_req)
> > > +{
> > > +	struct rpc_rqst *req;
> > > +	int i;
> > > +
> > > +	for (i = 0; i < num_req; i++) {
> > > +		req = rpc_alloc_slot(GFP_KERNEL);
> > > +		if (!req)
> > > +			return -ENOMEM;
> > > +		memset(req, 0, sizeof(*req));
> > > +		list_add(&req->rq_list, &xprt->free);
> > > +	}
> > > +	dprintk("<-- %s mempool_alloc %d reqs\n", __func__,
> > > +		xprt->max_reqs);
> > > +	return 0;
> > > +}
> > > +
> > 
> > So, I don't quite get this...
> > 
> > You declare a global mempool early on, and then allocate from that
> > mempool for a list of static entries. Doesn't that sort of rob you of
> > any benefit of using a mempool here? IOW, won't the static allocations
> > potentially rob the mempool of "guaranteed" entries such that the
> > dynamic ones eventually all turn into slab allocations anyway?
> 
> The other thing is that we really should _never_ be using GFP_KERNEL
> allocations in any paths that might be used for writebacks, since that
> might recurse back into the filesystem due to direct memory reclaims. In
> fact, since this code path will commonly be used by rpciod, then we
> should rather be using GFP_ATOMIC and/or GFP_NOWAIT (see rpc_malloc).
> 
> > What I think would make more sense would be to have multiple mempools
> > -- one per xprt and simply set the mempool size to the number of
> > "static" entries that you want for the mempool. Then you could get rid
> > of the free list, and just do allocations out of the mempool directly.
> > You'll be guaranteed to be able to allocate up to the number in the
> > mempool and everything above that would just becomes a slab allocation.
> 
> I'm not sure that we need multiple mempools: all we really require is
> that memory reclaims to be able to make some form of progress.
> I'd therefore rather advocate a global mempool (which matches our policy
> on the rpc buffer pool), but that we allow the allocations to fail if
> we're really low on memory.
> 
> The problem with multiple mempools is that if there is no activity on
> one of those filesystems, then we're basically locking up memory for no
> good reason.
> 

It might be worth pointing out that mempools only use the reserved memory as
a last resort.   They try a normal allocation first and only when that fails
do they dip into the pool.  This is because normal allocation avoids taking
global locks on common case - they are CPU-local.  The pool is a global
resource (i.e. for all CPUs) so a spinlock is needed.

As GFP_KERNEL essentially never fails, it will never dip into the pool, so
there is not a lot of point using it in a mempool allocation (though I guess
it could make the code cleaner/neater and that would be a good thing).

In general, mempools should be sized to the minimum number of entries
necessary to make forward progress when memory pressure is high.  This is
typically 1.  Making mempools larger than this might allow throughput to be a
bit higher during that high-memory-pressure time, but wastes memory all the
rest of the time.

You only need multiple mempools when they might be used on the one request.
i.e. if two separate allocation are needed to handle the one request, then
they must come from two separate mempools.
If two different allocations will always be for two independent requests, then
they can safely share a mempool.

For rpc slots, I doubt you need mempools at all.
Mempools are only needed if failure is not an option and you would rather
wait, but you cannot wait for regular writeback because you are on the
writeback path.  So you use a mempool which waits for a previous request to
complete.  I don't think that describes rpc slots at all.
For rpc slots, you can afford to wait when setting up the transport, as you
are not on the writeout path yet, and later you can always cope with failure.
So just use kmalloc.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux