On Mon, 7 Apr 2008 12:45:01 -0400
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> On Mon, Apr 07, 2008 at 09:38:34AM -0400, Jeff Layton wrote:
> > The global task and serv pointers for lockd are normally protected by
> > the nlmsvc_mutex. The exception is when the lockd exits abnormally. When
> > this occurs, these variables are cleared without any locking.
>
> Shouldn't we get rid of the case where it exits abnormally instead?
>

Not a bad idea. After chatting with Christoph a bit on IRC, I suppose
we have 2 options if we want to pursue this. When we get an unexpected
error from svc_recv(), we could:

1) sleep for a bit and then retry

2) call schedule() and sleep until kthread_stop() shuts down the thread

I think #1 is probably the best option. It's certainly the more fault
tolerant. That also fixes another potential problem -- right now if the
thread exits and the nlmsvc_users count isn't 0, then we can
potentially BUG() on the next lockd_up/lockd_down.

Any thoughts on what an appropriate sleep timeout should be when this
happens? I was thinking 1s or so...

Trond, Bruce, any thoughts?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
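
For illustration only, option #1 could look roughly like the user-space sketch below. It is not kernel code and not a proposed patch: fake_recv(), fake_sleep(), struct sim and RETRY_DELAY_MS are hypothetical stand-ins for svc_recv(), a timed sleep, the thread state and the ~1s timeout. Treating -EAGAIN and -EINTR as the "expected" returns is also an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical retry delay, mirroring the "1s or so" suggestion. */
enum { RETRY_DELAY_MS = 1000 };

/* Hypothetical simulation state; none of this is kernel API. */
struct sim {
	const int *results;	/* scripted return values for fake_recv() */
	int nresults;
	int next;		/* index of the next scripted result */
	int handled;		/* requests processed successfully */
	int slept_ms;		/* total simulated sleep time */
	bool stop;		/* stands in for kthread_should_stop() */
};

/* Stand-in for svc_recv(): replays the script, then requests a stop. */
static int fake_recv(struct sim *s)
{
	if (s->next >= s->nresults) {
		s->stop = true;	/* simulate kthread_stop() being called */
		return -EINTR;
	}
	return s->results[s->next++];
}

/* Stand-in for a timed sleep: only records the requested delay. */
static void fake_sleep(struct sim *s, int ms)
{
	s->slept_ms += ms;
}

/*
 * Main loop for option #1: an unexpected error never ends the loop.
 * The thread backs off and retries, so only an explicit stop request
 * can terminate it.
 */
static void service_loop(struct sim *s)
{
	assert(s != NULL);

	while (!s->stop) {
		int err = fake_recv(s);

		if (err == -EAGAIN || err == -EINTR)
			continue;	/* assumed expected: timeout or signal */
		if (err < 0) {
			/* unexpected error: sleep, then retry */
			fake_sleep(s, RETRY_DELAY_MS);
			continue;
		}
		s->handled++;		/* a request was received; process it */
	}
}
```

Since the loop can only end via the stop flag, the abnormal-exit path (and the unlocked clearing of the task and serv pointers that goes with it) would no longer exist in this model.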