On Fri, Feb 02, 2024 at 07:53:13PM -0800, Shifeng Li wrote:
> The mad_client will be initialized in enable_device_and_get(), while the
> devices_rwsem will be downgraded to a read semaphore. There is a window
> that leads to the failed initialization for cm_client, since it can not
> get matched mad port from ib_mad_port_list, and the matched mad port will
> be added to the list after that.
>
>     mad_client    |                       cm_client
> ------------------|--------------------------------------------------------
> ib_register_device|
> enable_device_and_get
> down_write(&devices_rwsem)
> xa_set_mark(&devices, DEVICE_REGISTERED)
> downgrade_write(&devices_rwsem)
>                   |
>                   |ib_cm_init
>                   |ib_register_client(&cm_client)
>                   |down_read(&devices_rwsem)
>                   |xa_for_each_marked (&devices, DEVICE_REGISTERED)
>                   |add_client_context
>                   |cm_add_one
>                   |ib_register_mad_agent
>                   |ib_get_mad_port
>                   |__ib_get_mad_port
>                   |list_for_each_entry(entry, &ib_mad_port_list, port_list)
>                   |return NULL
>                   |up_read(&devices_rwsem)
>                   |
> add_client_context|
> ib_mad_init_device|
> ib_mad_port_open  |
> list_add_tail(&port_priv->port_list, &ib_mad_port_list)
> up_read(&devices_rwsem)
>                   |
>
> Fix it by using down_write(&devices_rwsem) in ib_register_client().
>
> Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration")
> Suggested-by: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Ding Hui <dinghui@xxxxxxxxxxxxxx>
> Cc: Shifeng Li <lishifeng1992@xxxxxxx>
> Signed-off-by: Shifeng Li <lishifeng@xxxxxxxxxxxxxx>
> ---
>  drivers/infiniband/core/device.c | 33 +++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 15 deletions(-)

Applied to for-next, thanks

Jason