> On 22 Sep 2021, at 10:01, Leon Romanovsky <leon@xxxxxxxxxx> wrote: > > On Thu, Sep 16, 2021 at 03:34:46PM -0300, Jason Gunthorpe wrote: >> The FSM can run in a circle allowing rdma_resolve_ip() to be called twice >> on the same id_priv. While this cannot happen without going through the >> work, it violates the invariant that the same address resolution >> background request cannot be active twice. >> >> CPU 1 CPU 2 >> >> rdma_resolve_addr(): >> RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY >> rdma_resolve_ip(addr_handler) #1 >> >> process_one_req(): for #1 >> addr_handler(): >> RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND >> mutex_unlock(&id_priv->handler_mutex); >> [.. handler still running ..] >> >> rdma_resolve_addr(): >> RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY >> rdma_resolve_ip(addr_handler) >> !! two requests are now on the req_list >> >> rdma_destroy_id(): >> destroy_id_handler_unlock(): >> _destroy_id(): >> cma_cancel_operation(): >> rdma_addr_cancel() >> >> // process_one_req() self removes it >> spin_lock_bh(&lock); >> cancel_delayed_work(&req->work); >> if (!list_empty(&req->list)) == true >> >> ! rdma_addr_cancel() returns after process_on_req #1 is done >> >> kfree(id_priv) >> >> process_one_req(): for #2 >> addr_handler(): >> mutex_lock(&id_priv->handler_mutex); >> !! Use after free on id_priv >> >> rdma_addr_cancel() expects there to be one req on the list and only >> cancels the first one. The self-removal behavior of the work only happens >> after the handler has returned. This yields a situations where the >> req_list can have two reqs for the same "handle" but rdma_addr_cancel() >> only cancels the first one. >> >> The second req remains active beyond rdma_destroy_id() and will >> use-after-free id_priv once it inevitably triggers. >> >> Fix this by remembering if the id_priv has called rdma_resolve_ip() and >> always cancel before calling it again. This ensures the req_list never >> gets more than one item in it and doesn't cost anything in the normal flow >> that never uses this strange error path. >> >> Cc: stable@xxxxxxxxxxxxxxx >> Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager") >> Reported-by: syzbot+dc3dfba010d7671e05f5@xxxxxxxxxxxxxxxxxxxxxxxxx >> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> >> --- >> drivers/infiniband/core/cma.c | 17 +++++++++++++++++ >> drivers/infiniband/core/cma_priv.h | 1 + >> 2 files changed, 18 insertions(+) >> >> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c >> index c40791baced588..751cf5ea25f296 100644 >> --- a/drivers/infiniband/core/cma.c >> +++ b/drivers/infiniband/core/cma.c >> @@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv, >> { >> switch (state) { >> case RDMA_CM_ADDR_QUERY: >> + /* >> + * We can avoid doing the rdma_addr_cancel() based on state, >> + * only RDMA_CM_ADDR_QUERY has a work that could still execute. >> + * Notice that the addr_handler work could still be exiting >> + * outside this state, however due to the interaction with the >> + * handler_mutex the work is guaranteed not to touch id_priv >> + * during exit. >> + */ >> rdma_addr_cancel(&id_priv->id.route.addr.dev_addr); >> break; >> case RDMA_CM_ROUTE_QUERY: >> @@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, >> if (dst_addr->sa_family == AF_IB) { >> ret = cma_resolve_ib_addr(id_priv); >> } else { >> + /* The FSM can return back to RDMA_CM_ADDR_BOUND after >> + * rdma_resolve_ip() is called, eg through the error >> + * path in addr_handler. If this happens the existing >> + * request must be canceled before issuing a new one. >> + */ >> + if (id_priv->used_resolve_ip) >> + rdma_addr_cancel(&id->route.addr.dev_addr); >> + else >> + id_priv->used_resolve_ip = 1; > > Why don't you never clear this field? If you assume that this is one lifetime > event, can you please add a comment with an explanation "why"? Adding to that, don't you need {READ,WRITE}_ONCE when accessing used_resolve_ip? Or will the write to it obtain global visibility because mutex_unlock(&ctx->mutex) is executed before any other context can read it? Thxs, Håkon > > Thanks > >> ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr, >> &id->route.addr.dev_addr, >> timeout_ms, addr_handler, >> diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h >> index 5c463da9984536..f92f101ea9818f 100644 >> --- a/drivers/infiniband/core/cma_priv.h >> +++ b/drivers/infiniband/core/cma_priv.h >> @@ -91,6 +91,7 @@ struct rdma_id_private { >> u8 afonly; >> u8 timeout; >> u8 min_rnr_timer; >> + u8 used_resolve_ip; >> enum ib_gid_type gid_type; >> >> /* >> >> base-commit: ad17bbef3dd573da937816edc0ab84fed6a17fa6 >> -- >> 2.33.0