> On 14 Apr 2020, at 18:11, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote: > > On Tue, Apr 14, 2020 at 03:57:20PM +0200, Håkon Bugge wrote: >> >> >>> On 14 Apr 2020, at 14:50, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote: >>> >>> On Tue, Apr 14, 2020 at 12:34:35PM +0200, Håkon Bugge wrote: >>> >>>>>>>> Shall I make a v2 base on next based on this idea, or do you have >>>>>>>> something coming? >>>>>>> >>>>>>> Sure, I have nothing :) >>>>>>> >>>>>>> Also that rdma_destroy_id in addr_handler looks wrong.. ie we still >>>>>>> retain pointers to the rdma_cm_id it destroys inside the struct >>>>>>> ucma_context, don't we? >>>>>> >>>>>> On entry from user-space, we use the u32 id and looks it up in the >>>>>> XArray. But if rdma_destoy_id() is called asynchronously called >>>>>> between ucma_get_ctx_dev() and the de-reference of ctx->cm_id, we >>>>>> are toast. >>>>> >>>>> Is that what happens on the addr_handler path? >>>> >>>> No, there, the main problem is the revert of the state >>>> transitions. The first transition enables rdma_resolve_route() to >>>> pass its gate (i.e. state == ADDR_RESOLVED). Then it thinks the >>>> address is resolved, but the addr_handler changes its mind >>>> afterwards. >>> >>> That is a problem, but the call to rdma_destroy_id looks like another >>> problem >> >> I am not sure. Almost all events/incoming packets, can, after the >> cm_id's event_handler is called from cma_ib_handler(), call >> rdma_destroy_id(). > > I think the trick is that ucma_event_handler never returns failure > unless RDMA_CM_EVENT_CONNECT_REQUEST, which means the cm_id isn't in > the xarray yet? Sure does. 1 or -ENOMEM. But the ULP's event handlers isn't that polite. But a different issue from this syzkaller one. >> I assume the refcounting takes care of this. > > There is no refcounting for destroy, it must be called once. I was thinking about the "cma_deref_id(id_priv);" stuff, but I may have misunderstood. Thxs, Håkon