Re: [PATCH v3] rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL

On Mon, May 06, 2024 at 12:37:59PM +0300, Dan Aloni wrote:
> Under the scenario of IB device bonding, when bringing down one of the
> ports, or all ports, we saw xprtrdma entering a non-recoverable state
> where it is not even possible to complete the disconnect and shut
> down the mount, requiring a reboot. After debugging, we saw that
> transport connect never ended after receiving the
> RDMA_CM_EVENT_DEVICE_REMOVAL callback.
> 
> The DEVICE_REMOVAL callback is delivered irrespective of whether the
> CM_ID is connected, and ESTABLISHED may not have happened yet. So each
> of these states needs to be handled accordingly.
> 
> Fixes: 2acc5cae2923 ('xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed')
> Cc: Sagi Grimberg <sagi.grimberg@xxxxxxxxxxxx>
> Signed-off-by: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
> ---
>  net/sunrpc/xprtrdma/verbs.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> index 4f8d7efa469f..432557a553e7 100644
> --- a/net/sunrpc/xprtrdma/verbs.c
> +++ b/net/sunrpc/xprtrdma/verbs.c
> @@ -244,7 +244,11 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
>  	case RDMA_CM_EVENT_DEVICE_REMOVAL:
>  		pr_info("rpcrdma: removing device %s for %pISpc\n",
>  			ep->re_id->device->name, sap);
> -		fallthrough;
> +		switch (xchg(&ep->re_connect_status, -ENODEV)) {
> +		case 0: goto wake_connect_worker;
> +		case 1: goto disconnected;
> +		}
> +		return 0;
>  	case RDMA_CM_EVENT_ADDR_CHANGE:
>  		ep->re_connect_status = -ENODEV;
>  		goto disconnected;
> -- 
> 2.39.3
> 
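
For anyone following along, the three-way dispatch in the hunk above can be
modelled in userspace roughly as below. This is only a sketch, not the kernel
code: it uses C11 atomic_exchange() as a stand-in for the kernel's xchg(), it
assumes a status of 0 means a connect is still in progress and 1 means
ESTABLISHED was seen, and the function and enum names are hypothetical
stand-ins for the handler's goto labels.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical names for the three outcomes of the DEVICE_REMOVAL case. */
enum removal_action {
	ACT_WAKE_CONNECT_WORKER,	/* stands in for "goto wake_connect_worker" */
	ACT_DISCONNECT,			/* stands in for "goto disconnected" */
	ACT_NONE,			/* stands in for "return 0" */
};

/*
 * Atomically replace the connect status with -ENODEV and dispatch on the
 * value that was there before the swap.
 *
 * Assumed meaning of the previous value (not taken from the kernel source):
 *   0     connect still in progress, a waiter may be sleeping
 *   1     connection was established
 *   other an error was already recorded, nothing further to do
 */
static enum removal_action on_device_removal(atomic_int *connect_status)
{
	switch (atomic_exchange(connect_status, -ENODEV)) {
	case 0:
		return ACT_WAKE_CONNECT_WORKER;
	case 1:
		return ACT_DISCONNECT;
	default:
		return ACT_NONE;
	}
}
```

Because the swap and the read of the old value are one atomic step, a second
removal event on the same endpoint sees -ENODEV and takes the "do nothing"
path, so the wake-up or disconnect is chosen at most once per endpoint.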

Hi Anna,

Please apply this patch with:

Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>


-- 
Chuck Lever



