On Mon, Jun 12, 2023 at 02:42:37PM +0900, Shin'ichiro Kawasaki wrote:
> When rdma_destroy_id() and cma_iw_handler() race, struct rdma_id_private
> *id_priv can be destroyed during cma_iw_handler call. This causes "BUG:
> KASAN: slab-use-after-free" at mutex_lock() in cma_iw_handler() [1].
> To prevent the destroy of id_priv, keep its reference count by calling
> cma_id_get() and cma_id_put() at start and end of cma_iw_handler().
>
> [1]
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in __mutex_lock+0x1324/0x18f0
> Read of size 8 at addr ffff888197b37418 by task kworker/u8:0/9
>
> CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0 #62
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
> Workqueue: iw_cm_wq cm_work_handler [iw_cm]
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x57/0x90
>  print_report+0xcf/0x660
>  ? __mutex_lock+0x1324/0x18f0
>  kasan_report+0xa4/0xe0
>  ? __mutex_lock+0x1324/0x18f0
>  __mutex_lock+0x1324/0x18f0
>  ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
>  ? _raw_spin_unlock_irqrestore+0x30/0x60
>  ? rcu_is_watching+0x11/0xb0
>  ? _raw_spin_unlock_irqrestore+0x30/0x60
>  ? trace_hardirqs_on+0x12/0x100
>  ? __pfx___mutex_lock+0x10/0x10
>  ? __percpu_counter_sum+0x147/0x1e0
>  ? domain_dirty_limits+0x246/0x390
>  ? wb_over_bg_thresh+0x4d5/0x610
>  ? rcu_is_watching+0x11/0xb0
>  ? cma_iw_handler+0xac/0x4f0 [rdma_cm]
>  cma_iw_handler+0xac/0x4f0 [rdma_cm]

What is the full call chain here, eg with the static functions
un-inlined?

>
>  drivers/infiniband/core/cma.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 93a1c48d0c32..c5267d9bb184 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -2477,6 +2477,7 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
>  	struct sockaddr *laddr = (struct sockaddr *)&iw_event->local_addr;
>  	struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;
>
> +	cma_id_get(id_priv);
>  	mutex_lock(&id_priv->handler_mutex);
>  	if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
>  		goto out;
> @@ -2524,12 +2525,14 @@ static int cma_iw_handler(struct iw_cm_id *iw_id, struct iw_cm_event *iw_event)
>  	if (ret) {
>  		/* Destroy the CM ID by returning a non-zero value. */
>  		id_priv->cm_id.iw = NULL;
> +		cma_id_put(id_priv);
>  		destroy_id_handler_unlock(id_priv);
>  		return ret;
>  	}
>
>  out:
>  	mutex_unlock(&id_priv->handler_mutex);
> +	cma_id_put(id_priv);
>  	return ret;
>  }

cm_work_handler already has a ref on the iwcm_id_private

I think there is likely some much larger issue with the IW CM if the
cm_id can be destroyed while the iwcm_id is in use? It is weird that
there are two id memories for this :\

Jason