On May 4, 2011, at 11:18 AM, Jeff Layton wrote:

> On Wed, 4 May 2011 10:54:57 -0400
> Andy Adamson <andros@xxxxxxxxxx> wrote:
>
>>
>> On May 4, 2011, at 7:54 AM, Jeff Layton wrote:
>>
>>> On Wed, 4 May 2011 12:07:26 +1000
>>> NeilBrown <neilb@xxxxxxx> wrote:
>>>
>>>> On Tue, 03 May 2011 21:46:03 -0400 Trond Myklebust
>>>> <Trond.Myklebust@xxxxxxxxxx> wrote:
>>>>
>>>>> On Wed, 2011-05-04 at 11:18 +1000, NeilBrown wrote:
>>>>>> For rpc slots, I doubt you need mempools at all.
>>>>>> Mempools are only needed if failure is not an option and you would rather
>>>>>> wait, but you cannot wait for regular writeback because you are on the
>>>>>> writeback path. So you use a mempool which waits for a previous request to
>>>>>> complete. I don't think that describes rpc slots at all.
>>>>>> For rpc slots, you can afford to wait when setting up the transport, as you
>>>>>> are not on the writeout path yet, and later you can always cope with failure.
>>>>>> So just use kmalloc.
>>>>>
>>>>> Errr.... No. By the time you get to allocating an RPC slot, you may be
>>>>> bang smack in the middle of the writeout path.
>>>>>
>>>>> The scenario would be that something triggers a writeback (kswapd,
>>>>> direct reclaim,...) which triggers an RPC call, which requires you to
>>>>> allocate at least one rpc slot before you can put the write on the wire.
>>>>>
>>>>> I agree with your assertion that we only need one successful slot
>>>>> allocation in order to make overall forward progress, but we definitely
>>>>> do need that one...
>>>>>
>>>>
>>>> Probably I misunderstood the code, but I thought that there was a base set of
>>>> slots preallocated. Eventually one of those will become free, won't it?
>>>>
>>>> Looking at the code again, it only ever returns entries to the mempool when
>>>> it is about to destroy the xprt. That makes no sense. If you are using a
>>>> mempool, then you allocate when you need to use space, and free it as soon as
>>>> you have finished with it, so the next request can get a turn.
>>>>
>>>
>>> That was sort of my point, though I didn't articulate it very well.
>>>
>>> I don't think we want to keep this "static" set of rqsts and the
>>> mempool. It seems like that's somewhat redundant and more complex than
>>> is warranted. The question is...which should we keep?
>>>
>>> I think it sort of depends on what the intended model is. If we want to
>>> allocate out of the "guaranteed" pool first and only when that's
>>> exhausted fall back to doing dynamic allocations, then we probably want
>>> to keep the static list of entries.
>>>
>>> If we want to allocate dynamically most of the time and only dip into
>>> the "guaranteed" pool when that can't be satisfied, then basing all of
>>> this around mempools might make more sense.
>>
>> Thanks for the review - I've learned a lot about mempools from this discussion.
>>
>> I've noticed that a lot of the time (at least for TCP) rpc_slots are allocated and not used.
>> The default slot entries value is a guess. This is why I like the idea of dynamically allocating most
>> of the time, and only dipping into the guaranteed pool when the dynamic allocation
>> can't be satisfied, e.g. get rid of the static set of slots.
>> Does this design require two mempools, one for dynamic allocation and a guaranteed one being small and "in reserve"?
>>
>
> No, a single mempool would give you that. Just declare the mempool with
> the amount of "emergency" slots you want, and then allocate directly
> from it and then free them back to the mempool. Most of the time (as
> Neil points out) you'll get them allocated from the slabcache. It'll
> only dip into the reserved set when those can't be allocated.

Cool.
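Just to check that I understand the pattern, is something like the sketch
below what you mean? Completely untested, and the names (rpc_rqst_slab,
rpc_rqst_mempool, RPC_MIN_RESERVED_SLOTS) are made up for illustration, not
anything in the current code:

/* Untested illustration only; rpc_rqst_slab, rpc_rqst_mempool and
 * RPC_MIN_RESERVED_SLOTS are made-up names. */
#include <linux/mempool.h>
#include <linux/slab.h>
#include <linux/sunrpc/xprt.h>

#define RPC_MIN_RESERVED_SLOTS	2	/* "emergency" slots held in reserve */

static struct kmem_cache *rpc_rqst_slab;
static mempool_t *rpc_rqst_mempool;

static int rpc_rqst_mempool_init(void)
{
	rpc_rqst_slab = kmem_cache_create("rpc_rqst",
					  sizeof(struct rpc_rqst),
					  0, 0, NULL);
	if (rpc_rqst_slab == NULL)
		return -ENOMEM;

	/* min_nr elements are preallocated and handed out only when the
	 * slab allocation fails. */
	rpc_rqst_mempool = mempool_create_slab_pool(RPC_MIN_RESERVED_SLOTS,
						    rpc_rqst_slab);
	if (rpc_rqst_mempool == NULL) {
		kmem_cache_destroy(rpc_rqst_slab);
		return -ENOMEM;
	}
	return 0;
}

/* Normally satisfied from the slab cache; dips into the reserved set
 * only when that fails.  GFP_NOWAIT so we never sleep in the writeout
 * path; the caller must cope with a NULL return. */
static struct rpc_rqst *rpc_rqst_get(void)
{
	return mempool_alloc(rpc_rqst_mempool, GFP_NOWAIT);
}

/* Free the slot back as soon as the request completes, so the reserve
 * refills and the next writer can make forward progress. */
static void rpc_rqst_put(struct rpc_rqst *req)
{
	mempool_free(req, rpc_rqst_mempool);
}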
>
> The other thing you have to determine is whether you want this
> guaranteed number of slots to be per-xprt or global. Today, it's
> per-xprt. Making it global might mean less wasted memory overall, but I
> could foresee situations where a hung mount could starve other mounts of
> slots.
>
> Figuring that out might help guide the overall design here...
>
>> For UDP and RDMA, the default # of slot entries can turn into an upper bound.
>> For TCP, the default # of slot entries would have no meaning WRT upper bound, as the upper bound is determined
>> by the TCP window size.
>>
>
> Maybe rather than having this dynamic_slot_alloc operation that's only
> conditionally called, it would be cleaner to abstract out the slot
> allocator altogether.
>
> UDP and RDMA could use the same routine and just allocate out of a
> fixed set of preallocated slots (sort of like how they do today), and
> TCP could use this new scheme.

We need rpc_slots (UDP/RDMA/TCP/...) to be freed back to the mempool in
xprt_free_slot. If we allocate the set # of rpc_slots, place them on the
xprt->free list, and they don't get used, then they won't be freed until the
xprt is destroyed.

I was thinking of using the same allocator for UDP/RDMA/TCP. With all
transports, start off with a minimal # of pre-allocated slots (6? 4? 2?) and
for UDP/RDMA trigger a dynamic allocation in xprt_alloc_slot when there are no
slots available and the # slots allocated is less than the default # slot
entries for that transport. There's a rough sketch of what I mean at the end
of this mail.

-->Andy

> --
> Jeff Layton <jlayton@xxxxxxxxxx>
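P.S. Here is the rough sketch I mentioned above. It leans on the
rpc_rqst_mempool from the earlier sketch, and xprt->min_reqs and
xprt->num_reqs are fields I'm assuming we would add (only max_reqs, free and
reserve_lock exist in struct rpc_xprt today), so treat this as hand-waving
rather than a patch:

/* Common slot allocator sketch for all transports.  min_reqs and
 * num_reqs are assumed new fields: num_reqs is the total number of
 * slots currently allocated for this xprt (free + in use), initialized
 * to min_reqs preallocated slots.  For UDP/RDMA, max_reqs is the
 * default # slot entries; for TCP it would effectively be
 * window-driven. */
static struct rpc_rqst *xprt_alloc_slot_sketch(struct rpc_xprt *xprt)
{
	struct rpc_rqst *req = NULL;

	spin_lock(&xprt->reserve_lock);
	if (!list_empty(&xprt->free)) {
		/* Reuse a preallocated or previously freed slot. */
		req = list_entry(xprt->free.next, struct rpc_rqst, rq_list);
		list_del(&req->rq_list);
	} else if (xprt->num_reqs < xprt->max_reqs) {
		/* Below the transport's ceiling: allocate dynamically.
		 * GFP_NOWAIT because we hold a spinlock; the mempool
		 * falls back to its reserved elements if the slab
		 * allocation fails. */
		req = mempool_alloc(rpc_rqst_mempool, GFP_NOWAIT);
		if (req != NULL)
			xprt->num_reqs++;
	}
	spin_unlock(&xprt->reserve_lock);
	return req;	/* NULL => caller queues the task on the backlog */
}

/* Matching free path: anything above the preallocated floor goes back
 * to the mempool right away instead of sitting unused on xprt->free
 * until the xprt is destroyed. */
static void xprt_free_slot_sketch(struct rpc_xprt *xprt, struct rpc_rqst *req)
{
	spin_lock(&xprt->reserve_lock);
	if (xprt->num_reqs > xprt->min_reqs) {
		xprt->num_reqs--;
		mempool_free(req, rpc_rqst_mempool);
	} else {
		list_add(&req->rq_list, &xprt->free);
	}
	spin_unlock(&xprt->reserve_lock);
}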