On 4/13/11 5:42 PM, Trond Myklebust wrote:
On Wed, 2011-04-13 at 17:21 -0700, Dean wrote:
This issue has come up several times recently. My preference would be to
tie the availability of slots to the TCP window size, and basically say
that if the SOCK_ASYNC_NOSPACE flag is set on the socket, then we hold
off allocating more slots until we get a ->write_space() callback which
clears that flag.
For the RDMA case, we can continue to use the current system of a fixed
number of preallocated slots.
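
The TCP gating proposed above can be modelled in a few lines. This is only a toy sketch in Python, not kernel code: the class `SlotGate` and its methods are hypothetical names invented for illustration, and the boolean `nospace` merely stands in for the SOCK_ASYNC_NOSPACE flag. It assumes that a task which cannot get a slot is queued until the ->write_space() callback fires.

```python
class SlotGate:
    """Toy model: stop handing out slots while the socket reports no space."""

    def __init__(self):
        self.nospace = False   # stands in for SOCK_ASYNC_NOSPACE (assumption)
        self.in_use = 0        # slots currently allocated
        self.waiting = []      # tasks deferred until space is available

    def mark_nospace(self):
        # The socket send buffer filled up; hold off further allocations.
        self.nospace = True

    def alloc_slot(self, task):
        # Refuse (and queue the task) while the socket is backed up.
        if self.nospace:
            self.waiting.append(task)
            return False
        self.in_use += 1
        return True

    def write_space(self):
        # Models the ->write_space() callback: clear the flag and
        # return the tasks that may now retry their allocation.
        self.nospace = False
        woken, self.waiting = self.waiting, []
        return woken

    def release_slot(self):
        # A request completed; its slot is returned.
        self.in_use -= 1
```

In this model a caller retries `alloc_slot()` for each task returned by `write_space()`; the RDMA case with a fixed preallocated table is deliberately left out.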
I take it then that we'd want a similar scheme for UDP as well? I guess
I'm just not sure what the slot table is supposed to be for.
[andros] I look at the rpc_slot table as a representation of the amount of data the connection to the server
can handle - basically the number of slots should equal twice the bandwidth-delay product divided by max(rsize, wsize).
For TCP, this is the window size (ping time of a max MTU packet * interface bandwidth).
There is no reason to allocate more rpc_rqsts than can fit on the wire.
I agree with checking for space on the link.
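
As a worked illustration of the formula above, here is a minimal sketch in Python. The function name `estimate_slots` and its parameters are hypothetical, the link figures in the usage note are made-up examples, and rounding up to a whole slot is an assumption not stated in the thread.

```python
import math


def estimate_slots(bandwidth_bytes_per_sec, rtt_sec, rsize, wsize):
    """Slots ~= 2 * bandwidth-delay product / max(rsize, wsize)."""
    # Bandwidth-delay product: bytes that can be in flight on the link.
    bdp = bandwidth_bytes_per_sec * rtt_sec
    # Size each request by the larger of the two transfer sizes.
    per_request = max(rsize, wsize)
    # Round up, and never report fewer than one slot (assumption).
    return max(1, math.ceil(2 * bdp / per_request))
```

For example, a 1 Gbit/s link (125,000,000 bytes/s) with a 1 ms round trip and rsize = wsize = 32768 gives `estimate_slots(125_000_000, 0.001, 32768, 32768)`, which evaluates to 8.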
The above formula is a good lower bound on the maximum number of slots,
but there are many times when a client could use more slots than the
formula gives. For example, we don't want to punish writes if rsize >
wsize. Also, you have to account for the server memory, which can
sometimes hold several write requests while waiting for them to be
sync'd to disk, leaving the TCP buffers less than full.
Err... No... On the contrary, it is a good _upper_ bound on the number
of slots. There is no point in allocating a slot for an RPC request
which you know you have no ability to transmit. That has nothing to do
with rsize or wsize values: if the socket is backed up, it won't take
more data.
Absolutely, I'm just trying to point out that checking the
SOCK_ASYNC_NOSPACE flag seems to be the only way to guarantee it won't
take more data.
Dean
Trond