On 4/12/22 12:28, john.p.donnelly@xxxxxxxxxx wrote:
On 4/11/22 4:07 PM, Waiman Long wrote:
On 4/11/22 17:03, john.p.donnelly@xxxxxxxxxx wrote:
I have reached out to Waiman and he suggested this for our next
test pass:
1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path
Does this commit help to avoid the lockup problem?
Commit 1ee326196c6658 fixes a potential missed wakeup problem when
a reader first in the wait queue is interrupted out without
acquiring the lock. It is actually not a fix for commit
d257cc8cb8d5. However, d257cc8cb8d5 itself changes the out_nolock path
behavior of writers by leaving the handoff bit set when the wait
queue isn't empty. That likely makes the missed wakeup problem
easier to reproduce.
Cheers,
Longman
Hi,
We are testing now.
The ETA for fio soak test completion is ~15hr from now.
I wanted to share the stack traces for future reference + occurrences.
I am looking forward to your testing results tomorrow.
Cheers,
Longman
Hi
Our 24hr fio soak test with:
1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path
applied to 5.15.30 passed.
I suggest you append the following to 1ee326196c6658:
cc: stable
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
I'll leave the details of how to do that up to the core
maintainers ;-)
Thanks for the test.
The patch is already in the tip tree, so it may not be easy to add a
Fixes tag to it. Anyway, I will encourage the stable tree maintainers to
take it, as your test shows that it does fix a problem.
Cheers,
Longman