Re: writeback completion soft lockup BUG in folio_wake_bit()

On Sun, 2022-10-23 at 15:38 -0700, Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 6:35 PM Dan Williams
> <dan.j.williams@xxxxxxxxx> wrote:
> > 
> > A report from a tester with this call trace:
> > 
> >  watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
> >  RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [..]
> 
> Whee.
> 
> > ...led me to this thread. This was after I had them force all softirqs
> > to run in ksoftirqd context, and run with rq_affinity == 2 to force
> > I/O completion work to throttle new submissions.
> > 
> > Willy, are these headed upstream:
> > 
> > https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@xxxxxxxxxxxxxxxxxxxx
> > 
> > ...or am I missing an alternate solution posted elsewhere?
> 
> Can your reporter test that patch? I think it should still apply
> pretty much as-is.. And if we actually had somebody who had a
> test-case that was literally fixed by getting rid of the old bookmark
> code, that would make applying that patch a no-brainer.
> 
> The problem is that the original load that caused us to do that thing
> in the first place isn't repeatable because it was special production
> code - so removing that bookmark code because we _think_ it now hurts
> more than it helps is kind of a big hurdle.
> 
> But if we had some hard confirmation from somebody that "yes, the
> bookmark code is now hurting", that would make it a lot more palatable
> to just remove the code that we just _think_ that probably isn't
> needed any more..
> 
> 
I do think that the original locked-page-on-migration problem was fixed
by commit 9a1ea439b16b. Unfortunately, the customer did not respond
when we asked them to test their workload after that patch went into
mainline.

I have no objection to Matthew's fix to remove the bookmark code,
now that it is causing problems in this scenario, which I didn't
anticipate in my original code.

Tim




