My belief is that the issue is related to timing, not to the logical operation of the code. Work queues are just kernel processes and can be scheduled (if not holding spinlocks), while soft IRQs lock up the CPU until they exit. This can cause longer delays in responding to ULPs. The work queue tasks for each QP are strictly single threaded, which is managed by the work queue framework in the same way as for tasklets. The other evidence of this

-----Original Message-----
From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
Sent: Tuesday, September 19, 2023 3:07 AM
To: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
Cc: Bob Pearson <rpearsonhpe@xxxxxxxxx>; Bart Van Assche <bvanassche@xxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx
Subject: Re: [bug report] blktests srp/002 hang

On 2023/9/19 12:14, Shinichiro Kawasaki wrote:
> On Sep 16, 2023 / 13:59, Zhu Yanjun wrote:
> [...]
>> On Debian, with the latest multipathd or revert the commit
>> 9b4b7c1f9f54
>> ("RDMA/rxe: Add workqueue support for rxe tasks"), this problem will
>> disappear.
>
> Zhu, thank you for the actions.
>
>> On Fedora 38, if the commit 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue
>> support for rxe tasks") is reverted, will this problem still appear?
>> I do not have such test environment. The commit is in the attachment,
>> can anyone have a test? Please let us know the test result. Thanks.
>
> I tried the latest kernel tag v6.6-rc2 with my Fedora 38 test systems.
> With the v6.6-rc2 kernel, I still see the hang. I repeated the blktests
> test case srp/002 30 times or so, then the hang was recreated. Then I
> reverted the commit 9b4b7c1f9f54 from v6.6-rc2, and the hang
> disappeared. I repeated the blktests test case 100 times, and did not
> see the hang.
>
> I confirmed these results under two multipathd conditions: 1) with
> Fedora latest device-mapper-multipath package v0.9.4, and 2) the
> latest multipath-tools v0.9.6 that I built from source code.
>
> So, when the commit gets reverted, the hang disappears as I reported
> for v6.5-rcX kernels.

Thanks, Shinichiro Kawasaki. Your help is appreciated.

This problem is related to the following:

1) Linux distributions: Ubuntu, Debian and Fedora;
2) multipathd;
3) the commit 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue support for rxe tasks")

On Ubuntu, with or without the commit, this problem does not occur.

On Debian, without this commit, this problem does not occur. With this commit, this problem occurs.

On Fedora, without this commit, this problem does not occur. With this commit, this problem occurs.

The commit 9b4b7c1f9f54 ("RDMA/rxe: Add workqueue support for rxe tasks") is from Bob Pearson.

Hi, Bob, do you have any comments about this problem? It seems that this commit is not compatible with blktests.

Hi, Jason and Leon, please comment on this problem.

Thanks a lot.
Zhu Yanjun
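
For anyone who wants to repeat the comparison above (30 or so runs to hit the hang with the commit, 100 clean runs with it reverted), a minimal sketch of a repeat-until-failure loop follows. It is not part of blktests. The helper name `repeat_until_failure`, the per-run timeout, and the `./check srp/002` invocation shown in the comment are assumptions for illustration only.

```python
import subprocess


def repeat_until_failure(cmd, runs, timeout_s):
    """Run cmd up to `runs` times.

    Returns the 1-based index of the first run that exits non-zero or
    exceeds timeout_s seconds (treated here as a hang), or None if all
    runs complete successfully.
    """
    for i in range(1, runs + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The run did not finish in time; count it as a hang.
            return i
        if result.returncode != 0:
            return i
    return None


# Hypothetical usage from a blktests checkout (the 600 s limit is a guess):
#   first_bad = repeat_until_failure(["./check", "srp/002"], runs=100, timeout_s=600)
#   print("no failure" if first_bad is None else f"failed on run {first_bad}")
```

One caveat: on timeout, `subprocess.run` kills the child and then waits for it, so if the test process is stuck in uninterruptible sleep inside the kernel, the loop itself may not return and the hang has to be observed from another shell.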