On Sat, Apr 24, 2021 at 10:33:13AM +0800, Mark Zhang wrote:
> >
> > Set reverse call chains:
> >
> > cm_init_av_for_lap()
> >  cm_lap_handler(work) (ok)
> >
> > cm_init_av_for_response()
> >  cm_req_handler(work) (OK, cm_id_priv is on stack)
> >  cm_sidr_req_handler(work) (OK, cm_id_priv is on stack)
> >
> > cm_init_av_by_path()
> >  cm_req_handler(work) (OK, cm_id_priv is on stack)
> >  cm_lap_handler(work) (OK)
> >  ib_send_cm_req() (not locked)
> >    cma_connect_ib()
> >     rdma_connect_locked()
> >      [..]
> >    ipoib_cm_send_req()
> >    srp_send_req()
> >     srp_connect_ch()
> >      [..]
> >  ib_send_cm_sidr_req() (not locked)
> >    cma_resolve_ib_udp()
> >     rdma_connect_locked()
> >
>
> Both cm_init_av_for_lap()

Well, it is wrong today, look at cm_lap_handler():

	spin_lock_irq(&cm_id_priv->lock);
	[..]
	ret = cm_init_av_for_lap(work->port, work->mad_recv_wc->wc,
				 work->mad_recv_wc->recv_buf.grh,
				 &cm_id_priv->av);
	[..]
	cm_queue_work_unlock(cm_id_priv, work);

These need to be restructured, the sleeping calls to extract the
new_ah_attr have to be done before we go into the spinlock.

That is probably the general solution to all the cases, do some work
before the lock and then copy from the stack to the memory under the
spinlock.

Jason
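
For illustration, the "do the work before the lock, then copy from the
stack under the lock" pattern described above could look roughly like
the userspace sketch below. It is not the CM code: struct av,
struct conn, resolve_av() and conn_update_av() are hypothetical names
made up for this example, and a pthread mutex stands in for the kernel
spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for the address-vector state (not the real cm_av). */
struct av {
	int port;
	char dgid[16];
};

/* Hypothetical stand-in for cm_id_priv: shared state guarded by a lock. */
struct conn {
	pthread_mutex_t lock;	/* plays the role of the spinlock */
	struct av av;
};

/*
 * Hypothetical resolver standing in for the part of the AV setup that may
 * sleep. It only writes to the caller's private buffer, so it needs no lock.
 */
static int resolve_av(int port, const char *dgid, struct av *out)
{
	if (port <= 0)
		return -1;
	memset(out, 0, sizeof(*out));
	out->port = port;
	strncpy(out->dgid, dgid, sizeof(out->dgid) - 1);
	return 0;
}

static int conn_update_av(struct conn *conn, int port, const char *dgid)
{
	struct av tmp;
	int ret;

	assert(conn != NULL);

	/* Step 1: possibly-sleeping work, into a stack copy, outside the lock. */
	ret = resolve_av(port, dgid, &tmp);
	if (ret)
		return ret;	/* shared state is untouched on failure */

	/* Step 2: only a plain struct copy happens in the critical section. */
	pthread_mutex_lock(&conn->lock);
	conn->av = tmp;
	pthread_mutex_unlock(&conn->lock);
	return 0;
}
```

A caller would initialize the lock once and then call
conn_update_av(&c, port, dgid). A failed resolve leaves c.av as it was,
because nothing is written to the shared structure until the stack copy
is complete. In the real code, any state that must not change between
the resolve and the copy would presumably still need to be rechecked
after taking the lock.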