On Mon, 7 Apr 2008 16:50:27 -0400
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Mon, Apr 07, 2008 at 04:22:41PM -0400, Jeff Layton wrote:
> > On Mon, 7 Apr 2008 13:56:15 -0400
> > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >
> > > On Mon, Apr 07, 2008 at 12:45:01PM -0400, Christoph Hellwig wrote:
> > > > On Mon, Apr 07, 2008 at 09:38:34AM -0400, Jeff Layton wrote:
> > > > > The global task and serv pointers for lockd are normally protected by
> > > > > the nlmsvc_mutex. The exception is when the lockd exits abnormally. When
> > > > > this occurs, these variables are cleared without any locking.
> > > >
> > > > Shouldn't we get rid of the case where it exits abnormally instead?
> > >
> > > I tried to figure out when this could actually occur (when can
> > > svc_recv() return an error other than -EINTR or -EAGAIN?), and got lost
> > > in sock_recvmsg():
> > >
> > > 	- svc_recv() itself returns only -EAGAIN or the return from
> > > 	  ->xpo_recvfrom().
> > > 	- the only xpo_recvfrom() that's interesting is
> > > 	  svc_tcp_recvfrom(), which can return the error it gets from
> > > 	  svc_recvfrom(), which can return the error from
> > > 	  kernel_recvmsg(), which gets its return from sock_recvmsg().
> > >
> > > Since __sock_recvmsg() has a security hook, it looks like we can end up
> > > with an -EACCES from selinux?
> > >
> > > So one case would be selinux deciding we weren't allowed to receive
> > > packets from this socket. Huh.
> >
> > I got lost there too, but I would suspect that there are other errors
> > that can bubble up from the lower networking layers as well. Even if
> > there aren't currently, it's probably still prudent to assume that it's
> > a possibility and code for it.
> >
> > I tend to think the safest thing is probably to do a long sleep (1s or
> > so and retry when we get an error (maybe also a ratelimited printk?).
>
> Yeah, I guess I can't think of anything better.
>

Ok, I went ahead and did patches for this and gave them a quick test
this morning. Obviously, these are hard to fully unit test since this
seems to be a very uncommon occurrence.

Any thoughts?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
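
For anyone following the thread without the patches in front of them, the
"long sleep and retry, plus a ratelimited printk" idea discussed above can be
sketched roughly as below. This is a userspace illustration only, not the
patches themselves and not kernel code. The names recv_fn, sleep_fn,
struct ratelimit, ratelimit_ok(), service_loop(), scripted_recv() and
counting_sleep() are all hypothetical; the receive and sleep calls are passed
in as function pointers so the loop can be exercised without a real socket.
Treating -EINTR the same as -EAGAIN is a simplification made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for svc_recv(): returns 0 or a negative errno. */
typedef int (*recv_fn)(void *ctx);
/* Hypothetical, injectable sleep so the loop can be tested without waiting. */
typedef void (*sleep_fn)(unsigned int seconds);

/* Crude rate limiter: at most one message per interval_s seconds. */
struct ratelimit {
    int has_last;            /* 0 until the first message is let through */
    time_t last;             /* when the last message was let through */
    unsigned int interval_s; /* minimum gap between messages */
    unsigned int suppressed; /* messages dropped by the limiter */
};

static int ratelimit_ok(struct ratelimit *rl, time_t now)
{
    if (rl->has_last && now - rl->last < (time_t)rl->interval_s) {
        rl->suppressed++;
        return 0;
    }
    rl->has_last = 1;
    rl->last = now;
    return 1;
}

/*
 * Make up to max_iters receive attempts. Unexpected errors do not end the
 * loop: they are logged (ratelimited) and followed by a 1 second back-off.
 * Returns the number of back-offs taken.
 */
static unsigned int service_loop(recv_fn do_recv, sleep_fn do_sleep, void *ctx,
                                 struct ratelimit *rl, unsigned int max_iters)
{
    unsigned int backoffs = 0;

    assert(do_recv && do_sleep && rl);
    for (unsigned int i = 0; i < max_iters; i++) {
        int err = do_recv(ctx);

        /* Expected outcomes: request handled, timeout, or signal. */
        if (err >= 0 || err == -EAGAIN || err == -EINTR)
            continue;

        if (ratelimit_ok(rl, time(NULL)))
            fprintf(stderr, "unexpected receive error %d, retrying\n", err);
        do_sleep(1);
        backoffs++;
    }
    return backoffs;
}

/* Test doubles: a scripted receive function and a sleep that only counts. */
struct script {
    const int *results;
    unsigned int len, pos;
};

static int scripted_recv(void *ctx)
{
    struct script *s = ctx;
    return s->pos < s->len ? s->results[s->pos++] : 0;
}

static unsigned int slept_seconds;

static void counting_sleep(unsigned int seconds)
{
    slept_seconds += seconds;
}
```

Feeding the loop a script such as { 0, -EAGAIN, -EACCES, -EINTR, -EIO }
should give two back-offs (for -EACCES and -EIO) and, with a 5 second
ratelimit interval, only one logged message.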