Re: [PATCH v3] rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 06, 2024 at 06:09:51PM +0300, Sagi Grimberg wrote:
> 
> On 06/05/2024 12:37, Dan Aloni wrote:
> > Under the scenario of IB device bonding, when bringing down one of the
> > ports, or all ports, we saw xprtrdma entering a non-recoverable state
> > where it is not even possible to complete the disconnect and shut it
> > down the mount, requiring a reboot. Following debug, we saw that
> > transport connect never ended after receiving the
> > RDMA_CM_EVENT_DEVICE_REMOVAL callback.
> > 
> > The DEVICE_REMOVAL callback is irrespective of whether the CM_ID is
> > connected, and ESTABLISHED may not have happened. So need to work with
> > each of these states accordingly.
> > 
> > Fixes: 2acc5cae2923 ('xprtrdma: Prevent dereferencing r_xprt->rx_ep after it is freed')
> 
> Is this actually the offending commit ?
> 
> commit bebd031866ca ("xprtrdma: Support unplugging an HCA from under an NFS
> mount")
> is the one assuming DEVICE_REMOVAL triggers a disconnect not accounting that
> the
> cm_id may not be ESTABLISHED (where we need to wake the connect waiter?

I'd be OK with discussing possible culprits in the patch description
but leaving off a Fixes: tag for now.

It would be reasonable to demand that the proposed fix be applied
to each LTS kernel and tested individually to ensure there are no
side-effects or pre-requisites.


> Question though, in DEVICE_REMOVAL the device is going away as soon as the
> cm handler callback returns. Shouldn't nfs release all the device resources
> (related to this
> cm_id)? afaict it was changed in:
> e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")

In the case where a DEVICE_REMOVAL event fires and a connection
hasn't yet been established, my guess is the ep reference count will
go to zero when rpcrdma_ep_put() is called.


> The patch itself looks reasonable (although I do think that the rdma stack
> expects the
> ulp to have the rdma resources released when the callback returns).

Thanks for the review! Should we add:

Reviewed-by: Sagi Grimberg <sagi@xxxxxxxxxxx>


> FWIW in nvme we avoided the problem altogether by registering an ib_client
> that is
> called on .remove() and its a separate context that doesn't have all the
> intricacies with
> rdma_cm...

I looked at ib_client, years ago, and thought it would be a lot of
added complexity. With a code sample (NVMe host) maybe I can put
something together.


-- 
Chuck Lever




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux