> On Jun 26, 2020, at 11:10 AM, Dan Aloni <dan@xxxxxxxxxxxx> wrote: > > On Fri, Jun 26, 2020 at 08:56:41AM -0400, Chuck Lever wrote: >>> On Jun 26, 2020, at 3:10 AM, Dan Aloni <dan@xxxxxxxxxxxx> wrote: > [..] >>> - Add a mutex in `rpcrdma_ep_destroy` to guard against concurrent calls >>> to `rpcrdma_xprt_disconnect` coming from either `rpcrdma_xprt_connect` >>> or `xprt_rdma_close`. >> >> NAK. The RPC client provides appropriate exclusion, please let's not >> add more serialization that can introduce further deadlocks. > > It appeared to me that this exclusion does not works well. As for my > considerations, if I am not mistaken from analyzing crashes I've > seen: > > -> xprt_autoclose (running on xprtiod) > -> xprt->ops->close > -> xprt_rdma_close > -> rpcrdma_xprt_disconnect > > and: > > -> xprt_rdma_connect_worker (running on xprtiod) > -> rpcrdma_xprt_connect > -> rpcrdma_xprt_disconnect > > I understand the rationale or at least the aim that `close` and > `connect` ops should not be concurrent on the same `xprt`, however: > > * `xprt_force_disconnect`, is called from various places, queues > a call to `xprt_autoclose` to the background on `xprtiod` workqueue item, > conditioned that `!XPRT_LOCKED` which is the case for connect that went > to the background. > * `xprt_rdma_connect` also sends `xprt_rdma_connect_worker` as an `xprtiod` > workqueue item, unconditionally. > > So we have two work items that can run in parallel, and I don't see > clear gating on this from the code. If close and connect are being called concurrently on the same xprt, then there is a bug in the generic RPC xprt code. I don't believe that to be the case here. If xprtrdma invokes force_disconnect during connect processing, XPRT_LOCKED should be held and the close should be delayed. > Maybe there's a simpler fix for this. Perhaps a > `cancel_delayed_work_sync(&r_xprt->rx_connect_worker);` is appropriate > in `xprt_rdma_close`? There are simpler, less deadlock-prone, and more justifiable fixes. Please stand by, I will take care of this today. -- Chuck Lever