Re: writeback completion soft lockup BUG in folio_wake_bit()

On Wed, Oct 19, 2022 at 6:35 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> A report from a tester with this call trace:
>
>  watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
>  RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [..]

Whee.

> ...led me to this thread. This was after I had them force all softirqs
> to run in ksoftirqd context, and run with rq_affinity == 2 to force
> I/O completion work to throttle new submissions.
>
> Willy, are these headed upstream:
>
> https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@xxxxxxxxxxxxxxxxxxxx
>
> ...or am I missing an alternate solution posted elsewhere?

Can your reporter test that patch? I think it should still apply
pretty much as-is. And if we actually had somebody who had a
test-case that was literally fixed by getting rid of the old bookmark
code, that would make applying that patch a no-brainer.

The problem is that the original load that caused us to do that thing
in the first place isn't repeatable because it was special production
code - so removing that bookmark code because we _think_ it now hurts
more than it helps is kind of a big hurdle.

But if we had some hard confirmation from somebody that "yes, the
bookmark code is now hurting", that would make it a lot more palatable
to just remove the code that we _think_ probably isn't needed any
more.

                  Linus


