[ add Tim and Arechiga ]

Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 6:35 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > A report from a tester with this call trace:
> >
> >  watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
> >  RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [..]
>
> Whee.
>
> > ...lead me to this thread. This was after I had them force all softirqs
> > to run in ksoftirqd context, and run with rq_affinity == 2 to force
> > I/O completion work to throttle new submissions.
> >
> > Willy, are these headed upstream:
> >
> > https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@xxxxxxxxxxxxxxxxxxxx
> >
> > ...or I am missing an alternate solution posted elsewhere?
>
> Can your reporter test that patch? I think it should still apply
> pretty much as-is.. And if we actually had somebody who had a
> test-case that was literally fixed by getting rid of the old bookmark
> code, that would make applying that patch a no-brainer.
>
> The problem is that the original load that caused us to do that thing
> in the first place isn't repeatable because it was special production
> code - so removing that bookmark code because we _think_ it now hurts
> more than it helps is kind of a big hurdle.
>
> But if we had some hard confirmation from somebody that "yes, the
> bookmark code is now hurting", that would make it a lot more palatable
> to just remove the code that we just _think_ that probably isn't
> needed any more..

Arechiga reports that his test case that failed "fast" before now ran for
28 hours without a soft lockup report with the proposed patches applied.
So, I would consider those:

Tested-by: Jesus Arechiga Lopez <jesus.a.arechiga.lopez@xxxxxxxxx>

I notice that the original commit:

    11a19c7b099f sched/wait: Introduce wakeup boomark in wake_up_page_bit

...was trying to fix waitqueue lock contention.
The general approach of setting a bookmark and taking a break "could"
work, but in this case it would need to do something like return
-EWOULDBLOCK and let ksoftirqd fall into its cond_resched() retry path.
However, that would require plumbing the bookmark up several levels, not
to mention the other folio_wake_bit() callers that do not have a
convenient place to do cond_resched(). So I think that commit has
successfully found a way that waitqueue lock contention can not be
improved.
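
For illustration only, the "return -EWOULDBLOCK and retry" shape described
above can be sketched in plain userspace C. This is not kernel code and not
a proposed patch: toy_waitqueue, toy_wake_bounded() and toy_wake_all() are
hypothetical names, the wait queue is modeled as a flat array, and the lock
drop/reacquire that makes the real problem hard is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait queue: one "woken" flag per waiter. */
struct toy_waitqueue {
	int woken[64];
	size_t nr_waiters;
};

/*
 * Wake at most `budget` waiters, resuming from *bookmark.
 * Returns 0 once every waiter has been visited, or -EWOULDBLOCK when the
 * budget ran out first; *bookmark then records where the next call resumes.
 */
static int toy_wake_bounded(struct toy_waitqueue *wq, size_t *bookmark,
			    size_t budget)
{
	size_t done = 0;

	assert(budget > 0);
	while (*bookmark < wq->nr_waiters) {
		if (done == budget)
			return -EWOULDBLOCK;
		wq->woken[*bookmark] = 1;
		(*bookmark)++;
		done++;
	}
	return 0;
}

/*
 * Hypothetical caller that owns the bookmark and has a place to yield.
 * Returns how many times it had to back off before the walk completed.
 */
static size_t toy_wake_all(struct toy_waitqueue *wq, size_t budget)
{
	size_t bookmark = 0;
	size_t yields = 0;

	while (toy_wake_bounded(wq, &bookmark, budget) == -EWOULDBLOCK)
		yields++;	/* a kernel caller would cond_resched() here */
	return yields;
}
```

The point of the sketch is that the bookmark has to live in the caller that
can yield, which is exactly the plumbing that would need to go up several
levels, and callers without such a yield point have nowhere to put the retry
loop.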