Re: [PATCH 1/4] SUNRPC: Fix races when closing the socket

On Fri, Oct 29, 2021 at 4:11 PM <trondmy@xxxxxxxxxx> wrote:
>
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>
> Ensure that we bump the xprt->connect_cookie when we set the
> XPRT_CLOSE_WAIT flag so that another call to
> xprt_conditional_disconnect() won't race with the reconnection.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> ---
>  net/sunrpc/xprt.c     | 2 ++
>  net/sunrpc/xprtsock.c | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 48560188e84d..691fe5a682b6 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -735,6 +735,8 @@ static void xprt_autoclose(struct work_struct *work)
>         unsigned int pflags = memalloc_nofs_save();
>
>         trace_xprt_disconnect_auto(xprt);
> +       xprt->connect_cookie++;
> +       smp_mb__before_atomic();
>         clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>         xprt->ops->close(xprt);
>         xprt_release_write(xprt, NULL);
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 04f1b78bcbca..b18d13479104 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1134,6 +1134,7 @@ static void xs_run_error_worker(struct sock_xprt *transport, unsigned int nr)
>
>  static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
>  {
> +       xprt->connect_cookie++;
>         smp_mb__before_atomic();
>         clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
>         clear_bit(XPRT_CLOSING, &xprt->state);
> --
> 2.31.1
>

Hey Trond,

Are you working on the "double SYN" problem that Olga / NetApp found,
or on something else?

FWIW, late yesterday I tested these patches on top of your "testing"
branch and still see "double SYNs".  My reproducer is simple: I fire
off a bunch of "flocks" in parallel, then reboot the server.
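
A minimal sketch of that kind of reproducer, not the exact test used
here.  The function names, the per-child file naming scheme and the
process count are all hypothetical; the only assumption is that "dir"
points at a directory on the NFS mount under test.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) one file and take, then drop,
 * an exclusive flock() on it.  Returns 0 on success, -1 on failure. */
static int lock_one_file(const char *dir, int idx)
{
	char path[4096];
	int fd;

	snprintf(path, sizeof(path), "%s/flock-test-%d", dir, idx);
	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX) < 0) {
		close(fd);
		return -1;
	}
	flock(fd, LOCK_UN);
	close(fd);
	return 0;
}

/* Fork up to nprocs children that each lock their own file under dir.
 * Returns the number of children that exited with status 0. */
int run_parallel_flocks(const char *dir, int nprocs)
{
	int i, status, started = 0, ok = 0;

	for (i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;		/* stop forking, reap what we started */
		if (pid == 0)
			_exit(lock_one_file(dir, i) == 0 ? 0 : 1);
		started++;
	}
	for (i = 0; i < started; i++) {
		if (wait(&status) > 0 && WIFEXITED(status) &&
		    WEXITSTATUS(status) == 0)
			ok++;
	}
	return ok;
}
```

To use it, call run_parallel_flocks() in a loop against the mount
point and reboot the server while it runs, then check a packet
capture for duplicate SYNs from the client.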

If I can help with testing or something similar, let me know.  I also
noticed that the tracepoints useful for debugging these reconnect
problems do not include the source TCP port, which would be helpful.
I don't have a patch for that yet, but I have started looking into it.



