On Thu, 7 Mar 2024 11:19:52 +0800 Yafang Shao <laoar.shao@xxxxxxxxx> wrote:

> After we enabled mglru on our 384C1536GB production servers, we
> encountered frequent soft lockups attributed to scanning folios.
>
> The soft lockup is as follows:
>
> ...
>
> There were a total of 22 tasks waiting for this spinlock
> (RDI: ffff99d2b6ff9050):
>
> crash> foreach RU bt | grep -B 8 queued_spin_lock_slowpath | grep "RDI: ffff99d2b6ff9050" | wc -l
> 22

If we're holding the lock for this long then there's a possibility of
getting hit by the NMI watchdog also.

> Additionally, two other threads were also engaged in scanning folios,
> one with 19 waiters and the other with 15 waiters.
>
> To address this issue under heavy reclaim conditions, we introduced a
> hotfix that adds cond_resched() to scan_folios().  After applying this
> hotfix to our servers, the soft lockups ceased.
>
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4367,6 +4367,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>
>  		if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
>  			break;
> +
> +		spin_unlock_irq(&lruvec->lru_lock);
> +		cond_resched();
> +		spin_lock_irq(&lruvec->lru_lock);
>  	}

Presumably wrapping this with `if (need_resched())' will save some work.

This lock is held for a reason.  I'd like to see an analysis of why
this change is safe.
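
For illustration, the need_resched()-guarded variant suggested above might
look like this sketch (untested; `remaining', `isolated', `skipped_zone'
and the surrounding loop are from the quoted scan_folios() hunk):

		if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
			break;

		/*
		 * Only pay for the unlock/lock round trip when a
		 * reschedule is actually pending; the common case
		 * keeps lru_lock held across iterations.
		 */
		if (need_resched()) {
			spin_unlock_irq(&lruvec->lru_lock);
			cond_resched();
			spin_lock_irq(&lruvec->lru_lock);
		}

Note that dropping lru_lock mid-scan means the LRU can change under us, so
any state cached across the unlock (list position, counts) would need to be
revalidated, which is presumably what the requested safety analysis should
cover.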