On Thu, 17 Sep 2015 09:38:33 -0400
Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> On Tue, Sep 15, 2015 at 2:52 PM, Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> > On Tue, 15 Sep 2015 16:49:23 +0100
> > "Suzuki K. Poulose" <suzuki.poulose@xxxxxxx> wrote:
> >
> >>  net/sunrpc/xprtsock.c | 9 ++++++++-
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> >> index 7be90bc..6f4789d 100644
> >> --- a/net/sunrpc/xprtsock.c
> >> +++ b/net/sunrpc/xprtsock.c
> >> @@ -822,9 +822,16 @@ static void xs_reset_transport(struct sock_xprt *transport)
> >>      if (atomic_read(&transport->xprt.swapper))
> >>          sk_clear_memalloc(sk);
> >>
> >> -    kernel_sock_shutdown(sock, SHUT_RDWR);
> >> +    if (sock)
> >> +        kernel_sock_shutdown(sock, SHUT_RDWR);
> >>
> >
> > Good catch, but...isn't this still racy? What prevents transport->sock
> > being set to NULL after you assign it to "sock" but before calling
> > kernel_sock_shutdown?
>
> The XPRT_LOCKED state.
>

IDGI -- if the XPRT_LOCKED bit was supposed to prevent that, then
how could you hit the original race? There should be no concurrent
callers to xs_reset_transport on the same xprt, right?

AFAICT, that bit is not set in the xprt_destroy codepath, which may be
the root cause of the problem. How would we take it there anyway?
xprt_destroy is void return, and may not be called in the context of a
rpc_task. If it's contended, what do we do? Sleep until it's cleared?

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
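
To make the question above concrete, here is a minimal userspace sketch of the
pattern under discussion: a lock bit that serializes teardown of a socket
pointer. It is not kernel code and not the sunrpc implementation. Every name
in it (demo_transport, DEMO_XPRT_LOCKED, demo_reset_transport, and so on) is a
hypothetical stand-in, and C11 atomics take the place of the kernel's bit
operations. The bit only protects the pointer if every path that touches it
takes the bit first, which is the concern raised about the destroy path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "transport is locked" state bit. */
#define DEMO_XPRT_LOCKED 0x1u

struct demo_sock {
    int shutdown_calls;          /* how many times demo_shutdown() ran */
};

struct demo_transport {
    atomic_uint state;           /* flag word holding DEMO_XPRT_LOCKED */
    struct demo_sock *sock;      /* only touched while the bit is held */
};

void demo_transport_init(struct demo_transport *t, struct demo_sock *sock)
{
    atomic_init(&t->state, 0u);
    t->sock = sock;
}

void demo_shutdown(struct demo_sock *sock)
{
    sock->shutdown_calls++;
}

/* Atomically set the bit; only the caller that flipped it 0 -> 1 wins. */
bool demo_trylock(struct demo_transport *t)
{
    unsigned int old = atomic_fetch_or(&t->state, DEMO_XPRT_LOCKED);
    return (old & DEMO_XPRT_LOCKED) == 0;
}

void demo_unlock(struct demo_transport *t)
{
    unsigned int old = atomic_fetch_and(&t->state, ~DEMO_XPRT_LOCKED);
    assert(old & DEMO_XPRT_LOCKED);  /* unlocking an unheld bit is a bug */
    (void)old;
}

/*
 * Returns true if this call shut the socket down. Returns false if the
 * bit was contended or there was no socket left. What a void-returning
 * caller should do on contention (retry, sleep, give up) is exactly the
 * open question in the thread; this sketch just reports it.
 */
bool demo_reset_transport(struct demo_transport *t)
{
    struct demo_sock *sock;

    if (!demo_trylock(t))
        return false;

    /* Under the bit, reading and clearing the pointer cannot race. */
    sock = t->sock;
    t->sock = NULL;
    if (sock)
        demo_shutdown(sock);

    demo_unlock(t);
    return sock != NULL;
}
```

With the bit held around both the read of the pointer and the shutdown call, a
second caller either fails the trylock or finds the pointer already NULL. A
path that skips demo_trylock() gets neither guarantee, and a NULL check alone
does not close that window.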