On 4/26/2021 9:56 PM, Jason Gunthorpe wrote:
> On Sat, Apr 24, 2021 at 10:33:13AM +0800, Mark Zhang wrote:
>> Set reverse call chains:
>>
>> cm_init_av_for_lap()
>>     cm_lap_handler(work) (OK)
>>
>> cm_init_av_for_response()
>>     cm_req_handler(work) (OK, cm_id_priv is on stack)
>>     cm_sidr_req_handler(work) (OK, cm_id_priv is on stack)
>>
>> cm_init_av_by_path()
>>     cm_req_handler(work) (OK, cm_id_priv is on stack)
>>     cm_lap_handler(work) (OK)
>>     ib_send_cm_req() (not locked)
>>         cma_connect_ib()
>>             rdma_connect_locked()
>>         [..]
>>         ipoib_cm_send_req()
>>         srp_send_req()
>>             srp_connect_ch()
>>         [..]
>>     ib_send_cm_sidr_req() (not locked)
>>         cma_resolve_ib_udp()
>>             rdma_connect_locked()
>>
>> Both cm_init_av_for_lap()
> Well, it is wrong today, look at cm_lap_handler():
>
>     spin_lock_irq(&cm_id_priv->lock);
>     [..]
>     ret = cm_init_av_for_lap(work->port, work->mad_recv_wc->wc,
>                              work->mad_recv_wc->recv_buf.grh,
>                              &cm_id_priv->av);
>     [..]
>     cm_queue_work_unlock(cm_id_priv, work);
>
> These need to be restructured; the sleeping calls that extract the
> new_ah_attr have to be done before we go into the spinlock.
>
> That is probably the general solution to all the cases: do some work
> before the lock, then copy from the stack to the memory under the
> spinlock.
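That "resolve first, publish under the lock" pattern can be sketched in plain userspace C. This is only a toy model, not the actual cm.c code: `fake_resolve_ah()` stands in for the sleeping path/AH resolution, a pthread mutex stands in for `cm_id_priv->lock`, and `struct ah_attr` is a stand-in for `rdma_ah_attr`:

```c
#include <pthread.h>
#include <string.h>

/* Toy stand-ins: struct ah_attr models rdma_ah_attr, the mutex models
 * cm_id_priv->lock (a spinlock in the kernel, so no sleeping inside). */
struct ah_attr { int dlid; int sl; };

struct toy_cm_id {
    pthread_mutex_t lock;
    struct ah_attr av;
};

/* Stands in for the sleeping work (path/AH resolution); must run
 * with no lock held. */
static int fake_resolve_ah(struct ah_attr *out)
{
    out->dlid = 0xbeef;
    out->sl = 3;
    return 0;
}

/* The pattern: do the sleeping work into a stack copy first, then
 * take the lock only long enough to copy into the shared structure. */
static int update_av(struct toy_cm_id *id)
{
    struct ah_attr new_attr;
    int ret;

    ret = fake_resolve_ah(&new_attr);  /* sleeping call, unlocked */
    if (ret)
        return ret;

    pthread_mutex_lock(&id->lock);     /* only the copy is locked */
    memcpy(&id->av, &new_attr, sizeof(new_attr));
    pthread_mutex_unlock(&id->lock);
    return 0;
}
```

Applied to cm_lap_handler(), this would mean extracting the new ah_attr before spin_lock_irq(&cm_id_priv->lock) and only copying it into cm_id_priv->av under the lock.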
Maybe we can call cm_set_av_port(av, port) outside of cm_init_av_*, so
that the caller can take cm_id_priv->lock where needed?
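A toy sketch of that split, under the same stand-in assumptions as above (pthread mutex for `cm_id_priv->lock`; `toy_init_av()`, `toy_set_av_port()` and the structs are hypothetical models, not the real cm.c helpers): the init helper fills a caller-private av, and the port assignment is a separate step the caller can wrap in the lock together with the copy.

```c
#include <pthread.h>

/* Toy models: the init step works on a private av and may sleep; the
 * set-port step is the only part that needs the shared state locked. */
struct toy_port { int num; };
struct toy_av { const struct toy_port *port; int route; };

struct toy_cm_id {
    pthread_mutex_t lock;
    struct toy_av av;
};

/* Unlocked part: may sleep, touches only the caller-provided av. */
static void toy_init_av(struct toy_av *av)
{
    av->route = 42;   /* stands in for the resolved route/GRH state */
}

/* Lockable part: binds the av to a port, split out of init. */
static void toy_set_av_port(struct toy_av *av, const struct toy_port *port)
{
    av->port = port;
}

static void toy_update(struct toy_cm_id *id, const struct toy_port *port)
{
    struct toy_av tmp = { 0 };

    toy_init_av(&tmp);                /* sleeping work, no lock held */

    pthread_mutex_lock(&id->lock);
    id->av = tmp;                     /* publish the stack copy */
    toy_set_av_port(&id->av, port);   /* port assignment under the lock */
    pthread_mutex_unlock(&id->lock);
}
```

The point of the split is that the lock is held only around the copy and the port assignment, never around anything that can sleep.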