On Thu, 2008-10-23 at 20:18 -0500, Steve French wrote:
> On Thu, Oct 23, 2008 at 3:59 PM, Evgeniy Polyakov <zbr@xxxxxxxxxxx> wrote:
> > On Thu, Oct 23, 2008 at 02:40:01PM -0500, Steve French (smfrench@xxxxxxxxx) wrote:
> >> I think we already do the same thing as NFS; they still are turning
> >> off autotuning on the client, right?
> >>
> >> If we could set a "sk_min_sndbuf_size" someday (to e.g. twice the size
> >> of the cifs write frames, i.e. about 112K), would that be enough?
> >
> > You do not need to set socket buffers; instead, read data in chunks,
> > which will automatically make TCP progress.
>
> If that is the case (i.e. that cifs and nfs never need to set these over
> tcp), I am still having trouble reconciling it with the NFS guys'
> comments that they must set rcvbuf, and with Jim's comment below:
>
> > The other issue is that, at least for NFS, the receive buffer must be
> > big enough to hold the biggest possible rpc. If not, a partial rpc will
> > get stuck in the buffer and no progress will be made.

Jim & co are talking about the _server_ side, which has very different
requirements compared to a client. One of the NFS kernel server's main
tasks is to manage its own resources, and for that reason one of its
design constraints is that it only starts reading the next request from
a socket when it knows it has enough resources to send a reply without
blocking.

Look rather at the NFS client: it uses non-blocking modes together with
an aio engine (a.k.a. rpciod) to shove data as quickly as possible out
of the socket's skbufs and into the page cache and inode metadata
caches. No locking of TCP socket buffer sizes needed or used...

Trond
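
A rough userspace sketch of the client-side pattern described above:
keep the socket non-blocking and drain its receive queue in chunks
whenever it becomes readable, so the receiver keeps opening the TCP
window and autotuning stays in effect, with no need to lock the buffer
size via SO_RCVBUF/SO_SNDBUF. This is illustrative only, not the actual
rpciod/sunrpc code; set_nonblocking(), drain_socket(), consume(), and
the 16K chunk size are all invented for the example.

	/*
	 * Sketch of chunked, non-blocking receive. Assumes some event
	 * loop (poll/epoll) calls drain_socket() whenever the fd is
	 * readable, playing the role rpciod plays for the NFS client.
	 */
	#include <errno.h>
	#include <fcntl.h>
	#include <stddef.h>
	#include <sys/types.h>
	#include <sys/socket.h>

	/* Hypothetical consumer: hands each chunk off for RPC reassembly. */
	extern void consume(const char *buf, size_t len);

	/* Put the socket into non-blocking mode. */
	static int set_nonblocking(int fd)
	{
		int flags = fcntl(fd, F_GETFL, 0);

		if (flags < 0)
			return -1;
		return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
	}

	/*
	 * Drain whatever is queued in the socket's receive buffer right
	 * now. Returns 1 when the queue is empty for the moment, 0 on
	 * peer close, -1 on a real error.
	 */
	static int drain_socket(int fd)
	{
		char chunk[16 * 1024];	/* arbitrary chunk size for illustration */
		ssize_t n;

		for (;;) {
			n = recv(fd, chunk, sizeof(chunk), 0);
			if (n > 0) {
				consume(chunk, (size_t)n);	/* make progress */
				continue;
			}
			if (n == 0)
				return 0;			/* peer closed */
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				return 1;			/* drained for now */
			if (errno == EINTR)
				continue;
			return -1;
		}
	}

The server-side case Jim describes is the opposite constraint: the nfsd
threads deliberately defer reading the next request until a reply can
be sent without blocking, so there the receive buffer really does have
to be able to hold the largest possible rpc.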