On Apr 13, 2011, at 1:20 PM, Jeff Layton wrote:

> On Wed, 13 Apr 2011 10:22:13 -0400
> Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> wrote:
>
>> On Wed, 2011-04-13 at 10:02 -0400, Jeff Layton wrote:
>>> We could put the rpc_rqst's into a slabcache, and give each rpc_xprt a
>>> mempool with a minimum number of slots. Have them all be allocated with
>>> GFP_NOWAIT. If it gets a NULL pointer back, then the task can sleep on
>>> the waitqueue like it does today. Then, the clients can allocate
>>> rpc_rqst's as they need as long as memory holds out for it.
>>>
>>> We have the reserve_xprt stuff to handle congestion control anyway so I
>>> don't really see the value in the artificial limits that the slot table
>>> provides.
>>>
>>> Maybe I should hack up a patchset for this...
>>
>> This issue has come up several times recently. My preference would be to
>> tie the availability of slots to the TCP window size, and basically say
>> that if the SOCK_ASYNC_NOSPACE flag is set on the socket, then we hold
>> off allocating more slots until we get a ->write_space() callback which
>> clears that flag.
>>
>> For the RDMA case, we can continue to use the current system of a fixed
>> number of preallocated slots.
>>
>
> I take it then that we'd want a similar scheme for UDP as well? I guess
> I'm just not sure what the slot table is supposed to be for.

[andros] I look at the rpc_slot table as a representation of the amount of
data the connection to the server can handle - basically the #slots should
= double the bandwidth-delay product divided by the max(rsize/wsize).
For TCP, this is the window size. (ping of max MTU packet * interface
bandwidth). There is no reason to allocate more rpc_rqsts than can fit on
the wire.

>
> Possibly naive question, and maybe you or Andy have scoped this out
> already...
>
> Wouldn't it make more sense to allow the code to allocate rpc_rqst's as
> needed, and manage congestion control in reserve_xprt ?

[andros] Congestion control is not what the rpc_slot table is managing. It
does need to have a minimum, which experience has set at 16. It's the
maximum that needs to be dynamic. Congestion control by the lower layers
should work unfettered within the # of rpc_slots. Today that is not always
the case when 16 slots is not enough to fill the wire, and the
administrator has not changed the # of rpc_slots.

> It appears that
> that at least is what xprt_reserve_xprt_cong is supposed to do. The TCP
> variant (xprt_reserve_xprt) doesn't do that currently, but we could do
> it there and that would seem to make for more parity between the TCP
> and UDP in this sense.
>
> We could do that similarly for RDMA too. Simply keep track of how many
> RPCs are in flight and only allow reserving the xprt when that number
> hasn't crossed the max number of slots...
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
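
For illustration only, the sizing rule from the [andros] replies (#slots =
double the bandwidth-delay product divided by the larger of rsize and
wsize, with a floor of 16) could be sketched in C roughly as below. This
is not kernel code: bdp_bytes(), estimate_slots() and SKETCH_MIN_SLOTS are
hypothetical names, "max(rsize/wsize)" is read as max(rsize, wsize), and
the bandwidth and round-trip time are assumed to be known inputs rather
than measured by the transport.

```c
#include <stdint.h>

/* Floor on the slot count; 16 is the minimum mentioned in this thread.
 * The macro name is made up for this sketch. */
#define SKETCH_MIN_SLOTS 16u

/* Bandwidth-delay product in bytes: (bits/s / 8) * RTT in seconds.
 * Hypothetical helper; inputs are assumed, not measured. */
uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint64_t rtt_usec)
{
    return (bandwidth_bits_per_sec / 8) * rtt_usec / 1000000;
}

/* Estimated slot count: 2 * BDP / max(rsize, wsize), rounded up and
 * clamped below at SKETCH_MIN_SLOTS. Hypothetical helper. */
unsigned int estimate_slots(uint64_t bandwidth_bits_per_sec,
                            uint64_t rtt_usec,
                            uint32_t rsize, uint32_t wsize)
{
    uint64_t io_size = rsize > wsize ? rsize : wsize;
    uint64_t slots;

    /* Without a transfer size there is nothing to divide by. */
    if (io_size == 0)
        return SKETCH_MIN_SLOTS;

    /* Round up so a partial request still gets a slot. */
    slots = (2 * bdp_bytes(bandwidth_bits_per_sec, rtt_usec) + io_size - 1)
            / io_size;

    if (slots < SKETCH_MIN_SLOTS)
        slots = SKETCH_MIN_SLOTS;
    return (unsigned int)slots;
}
```

With these assumptions, a 1 Gbit/s link with a 1 ms round trip and 32 KB
rsize/wsize stays at the floor of 16, while a 10 Gbit/s link with a 10 ms
round trip and the same 32 KB transfers works out to several hundred
slots, which is the "16 slots is not enough to fill the wire" case.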