On Thu, Oct 23, 2008 at 2:35 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> On Thu, 23 Oct 2008 14:19:04 -0500
> "Steve French" <smfrench@xxxxxxxxx> wrote:
>> > His statement was basically "unless you know for sure that you don't
>> > want to use more than X amount of memory, then there isn't much reason
>> > to set the send and receive buffers".
>>
>> I think that there is a problem still - cifs needs tcp autotuning but
>> the buffers probably need to be at least as big as the largest SMB
>> response (about 17K on receives, but configurable to be larger). See
>> comment below about NFS:
>>
> > On Thu, Oct 23, 2008 at 2:05 PM, Jim Rees <rees@xxxxxxxxx> wrote:
>> There are two issues to be aware of. One is that the socket buffers have to
>> be big enough for the tcp congestion window. In the old days, the
>> application would have to know ahead of time how big this is, and call
>> setsockopt(), which sets these numbers.
>>
>> Now however, the tcp stack "autotunes" the buffer sizes to the correct
>> amount. If you call setsockopt() to set a buffer size, or set sk_*buf, or
>> set the SOCK_*BUF_LOCK bits in sk_userlocks, you defeat this autotuning.
>> This is almost always a bad idea.
>>
>> The other issue is that at least for NFS, the receive buffer must be big
>> enough to hold the biggest possible rpc. If not, a partial rpc will get
>> stuck in the buffer and no progress will be made.
>>
>> Issue one is easy to deal with, just don't muck with the socket internal
>> data structure. The second one is harder. What's really needed is a new
>> api into the tcp layer that will reserve a minimum amount of memory in the
>> socket buffer so that receives won't stall. For now, our patch sets the
>> sk_*buf values without setting the lock flags.
>
>
> Agreed. Thanks Jim, for sending along that info...
>
> It sounds like what we should do for CIFS is the same thing. Set the
> buffer sizes but make sure the SOCK_*BUF_LOCK bits are cleared so that
> they can grow as needed.

CIFS, unlike NFS, never set the SOCK_*BUF_LOCK flags. I think we
already do the same thing as NFS. They are still turning off autotuning
on the client, right?

If we could set a "sk_min_sndbuf_size" someday (e.g. to twice the size
of the cifs write frames, i.e. about 112K), would that be enough?

--
Thanks,

Steve
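
For anyone following along from userspace, here is a rough sketch of the two points discussed above. It is not from the thread and it is not kernel code. The helper names min_buffer_bytes and pin_rcvbuf are made up for illustration, and the 56K write frame size is only inferred from the "about 112K" figure. It assumes Linux behaviour, where an explicit SO_RCVBUF call is what sets SOCK_RCVBUF_LOCK on the socket.

```python
import socket


def min_buffer_bytes(largest_frame: int, frames: int = 2) -> int:
    """Hypothetical helper: smallest buffer that holds `frames` whole frames.

    Mirrors the "twice the size of the cifs write frames" suggestion.
    """
    return largest_frame * frames


def pin_rcvbuf(sock: socket.socket, size: int) -> int:
    """Hypothetical helper: explicitly size the receive buffer.

    On Linux, calling setsockopt(SO_RCVBUF) marks the buffer as user-set
    (SOCK_RCVBUF_LOCK), so receive-side autotuning stops for this socket.
    Returns the size the kernel reports afterwards, which on Linux is
    typically double the requested value and is capped by rmem_max.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def autotuned_rcvbuf(sock: socket.socket) -> int:
    """Report the current receive buffer size without touching it.

    Because nothing is set here, the TCP stack remains free to grow the
    buffer as the connection needs.
    """
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Calling min_buffer_bytes(56 * 1024) gives the 112K figure mentioned above. There is no userspace equivalent of "set a floor but keep autotuning", which is the gap a "sk_min_sndbuf_size" style knob would fill.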