Linus Torvalds wrote:
> On Fri, Mar 18, 2022 at 7:45 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > Excellent! I'm going to propose these two patches for -rc1 (I don't
> > think we want to be playing with this after -rc8)
>
> Ack. I think your commit message may be a bit too optimistic (who
> knows if other loads can trigger the over-long page locking wait-queue
> latencies), but since I don't see any other ways to really check this
> than just trying it, let's do it.
>
> Linus

A report from a tester with this call trace:

 watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
 RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40
 [..]
 Call Trace:
  <TASK>
  folio_wake_bit+0x8a/0x110
  folio_end_writeback+0x37/0x80
  ext4_finish_bio+0x19a/0x270
  ext4_end_bio+0x47/0x140
  blk_update_request+0x112/0x410

...led me to this thread. This was after I had them force all softirqs
to run in ksoftirqd context, and run with rq_affinity == 2 to force
I/O completion work to throttle new submissions.

Willy, are these headed upstream:

https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@xxxxxxxxxxxxxxxxxxxx

...or am I missing an alternate solution posted elsewhere?