Re: [bug report] WARNING: possible circular locking at: rdma_destroy_id+0x17/0x20 [rdma_cm] triggered by blktests nvmeof-mp/002

On 8/26/22 6:03 PM, yangx.jy@xxxxxxxxxxx wrote:
On 2022/8/25 14:26, Guoqing Jiang wrote:

On 8/25/22 1:59 PM, yangx.jy@xxxxxxxxxxx wrote:
On 2022/5/25 19:01, Sagi Grimberg wrote:
IIRC this was reported before; based on my analysis, lockdep is giving
a false alarm here. The reason is that the id_priv->handler_mutex cannot
be the same for both the cm_id that is handling the connect and the cm_id
that is handling rdma_destroy_id(), because rdma_destroy_id()
is always called on an already disconnected cm_id, so the deadlock
lockdep is complaining about cannot happen.
Hi Jason, Bart and Sagi,

I also think it is actually a false positive.  The cm_id handling the
connection and the cm_id calling rdma_destroy_id() cannot be the same
one, right?
I am wondering whether this is the same issue as the one in the following thread:

https://lore.kernel.org/linux-rdma/CAMGffEm22sP-oKK0D9=vOw77nbS05iwG7MC3DTVB0CyzVFhtXg@xxxxxxxxxxxxxx/
Hi Guoqing,

Thanks for your feedback.

I think they are the same deadlock issue (i.e. AB vs BCA).  The only
difference is that a different combination of locks triggers it.

It seems that one id_priv->handler_mutex is locked on the newly created
cm_id and the other id_priv->handler_mutex is locked on the disconnected
cm_id.
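
To illustrate why a checker can still complain when the two mutexes belong
to different cm_ids, here is a minimal userspace sketch in C. It is a toy
model only, not lockdep's implementation: every name in it (toy_lock,
toy_checker, toy_scenario, the class numbers) is hypothetical. The one
assumption it relies on is that ordering is recorded per lock class rather
than per lock instance, so two handler_mutex instances look like the same
node in the dependency graph.

```c
/* Toy userspace model of class-based lock-order checking.
 * NOT the kernel's lockdep code; all names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8
#define MAX_HELD    8

struct toy_lock {
	int class_id;  /* shared by all locks of the same kind */
	int instance;  /* identifies the individual lock; the checker ignores it */
};

struct toy_checker {
	bool edge[MAX_CLASSES][MAX_CLASSES]; /* edge[a][b]: b was taken while a was held */
	int held[MAX_HELD];                  /* classes currently held, in order */
	int nheld;
};

static void toy_init(struct toy_checker *c)
{
	memset(c, 0, sizeof(*c));
}

/* Depth-first search: can class 'to' be reached from class 'from'? */
static bool toy_reachable(const struct toy_checker *c, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (c->edge[from][next] && !seen[next] &&
		    toy_reachable(c, next, to, seen))
			return true;
	return false;
}

/* Record the acquisition; return true if it closes a cycle between classes. */
static bool toy_acquire(struct toy_checker *c, const struct toy_lock *l)
{
	bool warn = false;

	assert(l->class_id >= 0 && l->class_id < MAX_CLASSES);
	assert(c->nheld < MAX_HELD);

	for (int i = 0; i < c->nheld; i++) {
		bool seen[MAX_CLASSES] = { false };
		int h = c->held[i];

		/* Adding h -> l is a cycle if h is already reachable from l. */
		if (toy_reachable(c, l->class_id, h, seen))
			warn = true;
		c->edge[h][l->class_id] = true;
	}
	c->held[c->nheld++] = l->class_id;
	return warn;
}

/* Release the most recently acquired lock (LIFO order only). */
static void toy_release(struct toy_checker *c)
{
	assert(c->nheld > 0);
	c->nheld--;
}

/* Class 0 stands in for handler_mutex, class 1 for some other lock B.
 * 'old_class' is the class assigned to the disconnected cm_id's mutex. */
static bool toy_scenario(int old_class)
{
	struct toy_checker c;
	struct toy_lock new_id_mutex = { .class_id = 0, .instance = 1 };
	struct toy_lock old_id_mutex = { .class_id = old_class, .instance = 2 };
	struct toy_lock other = { .class_id = 1, .instance = 1 };
	bool warned = false;

	toy_init(&c);

	/* Path 1: new cm_id's mutex, then B. */
	warned |= toy_acquire(&c, &new_id_mutex);
	warned |= toy_acquire(&c, &other);
	toy_release(&c);
	toy_release(&c);

	/* Path 2: B, then the disconnected cm_id's mutex. */
	warned |= toy_acquire(&c, &other);
	warned |= toy_acquire(&c, &old_id_mutex);
	toy_release(&c);
	toy_release(&c);

	return warned;
}
```

In this model toy_scenario(0) returns true although the two mutexes are
distinct instances, while toy_scenario(2), which places the second mutex in
its own class, returns false. Whether a separate class would be appropriate
for the real RDMA/CM code is not something this sketch can settle.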

I'm not sure how to settle this.
Do you have any suggestions for removing the false positive by refactoring
the related RDMA/CM code? Sorry, I don't know how to do it at the moment.
The simplest way is to call lockdep_off() when it is known to be a false
alarm, which avoids the debugging effort, but not everyone likes the idea.

https://elixir.bootlin.com/linux/v6.0-rc2/C/ident/lockdep_off
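
As a rough illustration of the trade-off behind that workaround, the
userspace sketch below models a per-task "checking off" counter. It is not
kernel code: toy_task, toy_lockdep_off(), toy_lockdep_on(), toy_check() and
toy_destroy_path() are hypothetical stand-ins, and the only assumption is
that reports are suppressed for everything between the off and on calls.

```c
/* Toy userspace model of bracketing a code path with "checker off/on".
 * NOT kernel code; every name here is a hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>

struct toy_task {
	int checker_off_depth; /* greater than 0 means checking is suppressed */
	int reports;           /* number of ordering reports emitted */
};

static void toy_lockdep_off(struct toy_task *t)
{
	t->checker_off_depth++;
}

static void toy_lockdep_on(struct toy_task *t)
{
	assert(t->checker_off_depth > 0);
	t->checker_off_depth--;
}

/* Stand-in for the checker seeing an acquisition that looks like an inversion. */
static void toy_check(struct toy_task *t, bool looks_like_inversion)
{
	if (t->checker_off_depth == 0 && looks_like_inversion)
		t->reports++;
}

/* Returns how many reports the path produces. */
static int toy_destroy_path(bool suppress, bool real_bug_inside)
{
	struct toy_task t = { 0, 0 };

	if (suppress)
		toy_lockdep_off(&t);

	toy_check(&t, true);            /* the known false alarm */
	toy_check(&t, real_bug_inside); /* anything else in the same section */

	if (suppress)
		toy_lockdep_on(&t);

	return t.reports;
}
```

In this model toy_destroy_path(true, false) produces no report, which is the
intended effect, but toy_destroy_path(true, true) is silent as well: a
genuine problem inside the bracketed section would also go unreported.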
To be honest, I don't like that kind of fix either. I wonder if we can avoid
the false positive by changing the related RDMA/CM code.

I would consider it a workaround until the CM code is changed (which I guess
needs more effort, though hopefully I am wrong); otherwise different people
will keep posting similar issues to the list.

Thanks,
Guoqing


