On Thu, 23 Oct 2008 11:42:49 -0500 "Steve French" <smfrench@xxxxxxxxx> wrote:

> On Thu, Oct 23, 2008 at 11:26 AM, Shirish Pargaonkar
> <shirishpargaonkar@xxxxxxxxx> wrote:
> > On Thu, Oct 23, 2008 at 9:54 AM, Steve French <smfrench@xxxxxxxxx> wrote:
> >>> The big question is what happens under heavy memory pressure. In that
> >>> case we may not be able to copy the data from the buffer or iov into
> >>> the skbuf. With non-blocking I/O, kernel_sendmsg will return an error
> >>> if nothing was copied (probably -EAGAIN or -ENOSPC). If some of the
> >>> data could be copied, then it'll return a positive value that is less
> >>> than the size of the send.
>
> I think we are ok, as long as in all paths we either:
> a) resend the remainder of the SMB (by retrying what is left in the buffer),
> or
> b) kill the tcp session and reconnect (and either retry with a new handle
> or return an EHOSTDOWN error to the vfs, depending on the path)
>
> >>
> >>> Regardless of whether we use blocking I/O or not, kernel_sendmsg could
> >>> still copy less data than was requested. The patch needs to account
> >>> for this, probably by retrying the send with the remaining data if that
> >>> occurs.
> >> Yes - if that is true we need to retry at least once on this.
> >>
> >>> I don't think Shirish's patch is actually fixing this problem; it is
> >>> probably just making it harder to hit.
> >>>
> >>> I think, overall, that blocking I/O makes sense. It should make error
> >>> handling easier and I don't see any real disadvantage. We might
> >>> consider using longer timeouts, removing the MSG_NOSIGNAL flag, and
> >>> changing the code to deal with the possibility of catching a signal
> >>> during the kernel_sendmsg call.
> >>>
> >>> Also, we're setting the sk_sndbuf and sk_rcvbuf in ipv4_connect.
> >>> According to Neil, that may be hurting us here. It has the effect of
> >>> placing a hard limit on the send buffer for the socket, and may be
> >>> making it more likely that we get an error back from kernel_sendmsg.
> >> Yes - Sridhar and Dave Stevens mentioned that too. We haven't needed
> >> that (setting sndbuf) for at least a few kernel versions (is the
> >> automatic socket buffer size tuning working as far back as
> >> RHEL5/2.6.18 and SLES10/2.6.16 too?)
> >>
> >> Shirish,
> >> I thought that was part of your patch?
> >
> > No, it is not, but it is an easy change: just do not set them.
>
> As I look into this sndbuf and rcvbuf size setting... what concerns
> me is why nfs still sets these snd and rcvbuf sizes if they don't
> need to be set. We (cifs) have larger write sizes (56K) than nfs's
> default. See svc_set_sockbufsize in net/sunrpc/svcsock.c.

AFAICT, those size settings have been there for a long time (2002?). My
guess is that they predate the socket autotuning that Neil H. mentioned
to me. His statement was basically "unless you know for sure that you
don't want to use more than X amount of memory, there isn't much reason
to set the send and receive buffers".

--
Jeff Layton <jlayton@xxxxxxxxxx>
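
For readers following along: the "retry the remainder" handling discussed
above boils down to looping on kernel_sendmsg() and advancing the kvec
array past whatever was already copied. Below is a minimal sketch of that
idea, not the actual fs/cifs send path; the function name, retry limit,
and backoff value are invented for illustration, and it assumes
kernel_sendmsg() leaves the caller's kvec untouched.

/*
 * Illustrative sketch only -- not fs/cifs code.  Resend whatever is
 * left of the request when kernel_sendmsg() does a partial copy, and
 * back off and retry when nothing could be copied at all.
 */
#include <linux/net.h>		/* kernel_sendmsg() */
#include <linux/socket.h>	/* MSG_NOSIGNAL */
#include <linux/uio.h>		/* struct kvec */
#include <linux/delay.h>	/* msleep() */
#include <linux/errno.h>

static int example_send_full_smb(struct socket *sock, struct kvec *iov,
				 int n_vec, unsigned int total_len)
{
	struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
	unsigned int remaining = total_len;
	int retries = 0;
	int rc;

	while (remaining > 0) {
		rc = kernel_sendmsg(sock, &msg, iov, n_vec, remaining);
		if (rc == -EAGAIN || rc == -ENOSPC) {
			/* Nothing was copied; back off briefly, but give
			 * up eventually so the caller can kill and
			 * reconnect the tcp session instead. */
			if (++retries > 15)
				return -EAGAIN;
			msleep(3);
			continue;
		}
		if (rc < 0)
			return rc;	/* hard error */

		/* Partial send: step past the bytes already copied. */
		remaining -= rc;
		while (rc > 0 && (size_t)rc >= iov->iov_len) {
			rc -= iov->iov_len;
			iov++;
			n_vec--;
		}
		if (rc > 0) {
			iov->iov_base = (char *)iov->iov_base + rc;
			iov->iov_len -= rc;
		}
		retries = 0;
	}
	return 0;
}

A real caller would still need option (b) from the thread: once the
retries are exhausted, tear down the TCP session and reconnect (or return
EHOSTDOWN to the VFS) rather than retrying indefinitely.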