On Wed, 4 May 2011 13:22:01 -0400 Andy Adamson <andros@xxxxxxxxxx> wrote: > > On May 4, 2011, at 12:01 PM, Chuck Lever wrote: > > > > > On May 4, 2011, at 11:52 AM, Andy Adamson wrote: > > > >> > >> On May 4, 2011, at 11:18 AM, Jeff Layton wrote: > >> > >>> On Wed, 4 May 2011 10:54:57 -0400 > >>> Andy Adamson <andros@xxxxxxxxxx> wrote: > >>> > >>>> > >>>> On May 4, 2011, at 7:54 AM, Jeff Layton wrote: > >>>> > >>>>> On Wed, 4 May 2011 12:07:26 +1000 > >>>>> NeilBrown <neilb@xxxxxxx> wrote: > >>>>> > >>>>>> On Tue, 03 May 2011 21:46:03 -0400 Trond Myklebust > >>>>>> <Trond.Myklebust@xxxxxxxxxx> wrote: > >>>>>> > >>>>>>> On Wed, 2011-05-04 at 11:18 +1000, NeilBrown wrote: > >>>>>>>> For rpc slots, I doubt you need mempools at all. > >>>>>>>> Mempools are only needed if failure is not an option and you would rather > >>>>>>>> wait, but you cannot wait for regular writeback because you are on the > >>>>>>>> writeback path. So you use a mempool which waits for a previous request to > >>>>>>>> complete. I don't think that describes rpc slots at all. > >>>>>>>> For rpc slots, you can afford to wait when setting up the transport, as you > >>>>>>>> are not on the writeout path yet, and later you can always cope with failure. > >>>>>>>> So just use kmalloc. > >>>>>>> > >>>>>>> Errr.... No. By the time you get to allocating an RPC slot, you may be > >>>>>>> bang smack in the middle of the writeout path. > >>>>>>> > >>>>>>> The scenario would be that something triggers a writeback (kswapd, > >>>>>>> direct reclaim,...) which triggers an RPC call, which requires you to > >>>>>>> allocate at least one rpc slot before you can put the write on the wire. > >>>>>>> > >>>>>>> I agree with your assertion that we only need one successful slot > >>>>>>> allocation in order to make overall forward progress, but we definitely > >>>>>>> do need that one... > >>>>>>> > >>>>>> > >>>>>> Probably I misunderstood the code, but I thought that there was a base set of > >>>>>> slots preallocated. Eventually one of those will become free won't it? > >>>>>> > >>>>>> Looking at the code again, it only ever returns entries to the mempool when > >>>>>> it is about to destroy the xprt. That makes no sense. If you are using a > >>>>>> mempool, then you allocate when you need to use space, and free it as soon as > >>>>>> you have finished with it, so the next request can get a turn. > >>>>>> > >>>>> > >>>>> That was sort of my point, though I didn't articulate it very well. > >>>>> > >>>>> I don't think we want to keep this "static" set of rqsts and the > >>>>> mempool. It seems like that's somewhat redundant and more complex than > >>>>> is warranted. The question is...which should we keep? > >>>>> > >>>>> I think it sort of depends on what the intended model is. If we want to > >>>>> allocate out of the "guaranteed" pool first and only when that's > >>>>> exhausted fall back to doing dynamic allocations, then we probably want > >>>>> to keep the static list of entries. > >>>>> > >>>>> If we want to allocate dynamically most of the time and only dip into > >>>>> the "guaranteed" pool when that can't be satisfied, then basing all of > >>>>> this around mempools might make more sense. > >>>> > >>>> Thanks for the review - I've learned a lot about mempools from this discussion. > >>>> > >>>> I've noticed that a lot of the time, (at least for TCP) rpc_slots are allocated and not used. > >>>> The default slot entries value is a guess. This is why I like the idea of dynamically allocating most > >>>> of the time, and only dipping into the guaranteed pool when the dynamic allocation > >>>> can't be satisfied. e.g. get rid of the static set of slots. > >>>> Does this design require two mempools, one for dynamic allocation and a guaranteed one being small and "in reserve"? > >>>> > >>> > >>> No, a single mempool would give you that. Just declare the mempool with > >>> the amount of "emergency" slots you want, and then allocate directly > >>> from it and then free them back to the mempool. Most of the time (as > >>> Neil points out) you'll get them allocated from the slabcache. It'll > >>> it'll only dip into the reserved set when those can't be allocated. > >> > >> Cool. > >> > >>> > >>> The other thing you have to determine is whether you want this > >>> guaranteed number of slots to be per-xprt or global. Today, it's > >>> per-xprt. Making it global might mean less wasted memory overall, but I > >>> could forsee situations where a hung mount could starve other mounts of > >>> slots. > >>> > >>> Figuring that out might help guide the overall design here... > >>> > >>>> For UDP and RDMA, the default # of slot entries can turn into an upper bound. > >>>> For TCP, the default # of slot entries would have no meaning WRT upper bound, as the upper bound is determined > >>>> by the TCP window size. > >>>> > >>> > >>> Maybe rather than having this dynamic_slot_alloc operation that's only > >>> conditionally called, it would cleaner to abstract out the slot > >>> allocator altogether. > >>> > >>> UDP and RDMA could use the same routine and just allocate out of a > >>> fixed set of preallocated slots (sort of like how they do today), and > >>> TCP could use this new scheme. > >> > >> We need rpc_slots (UDP/RDMA/TCP/....) to be freed back to the mempool in xprt_free_slot. > >> If we allocate the set # of rpc_slots, place them on the xprt->free list, and they don't get used, > >> then they won't be freed until the xprt is destroyed. > > > > The reason to keep slots around and on a local free list is that full-out memory allocation is heavy weight. I keep thinking of the talk Willy gave at LSF about how much scalability was affected in the SCSI I/O path by simple memory allocation operations. Is using a SLAB or SLUB as fast as what we have today? > > That is a good question. We could implement this same slot allocation scheme without a SLAB/SLUB - just use kmalloc and leave xprt_free_slot to just move the allocated slots to the free list as is done today. I don't have any information on a performance hit. > > Is this worth testing, or should we just stick with kmalloc and friends? > I think Chuck's point here is that there is an argument for doing it the other way around -- keeping a static set of objects and only dynamically allocating when those are used up. Of course, pulling them off the free list is not cost-free either (you do need to take a spinlock there), but it may have less contention. It's hard for me to imagine that this will make a big difference overall, but I guess you never know. Personally, I wouldn't worry about it. :) -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html