On Mon, 13 Jan 2025 14:51:55 +0800 Chen Ridong <chenridong@xxxxxxxxxxxxxxx> wrote:

>
>
> On 2025/1/6 16:45, Vlastimil Babka wrote:
> > On 12/24/24 03:52, Chen Ridong wrote:
> >> From: Chen Ridong <chenridong@xxxxxxxxxx>
> >
> > +CC RCU
> >
> >> A soft lockup issue was found in the product with about 56,000 tasks were
> >> in the OOM cgroup, it was traversing them when the soft lockup was
> >> triggered.
> >>
> > ...
> >
> >> @@ -430,10 +431,15 @@ static void dump_tasks(struct oom_control *oc)
> >>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
> >>  	else {
> >>  		struct task_struct *p;
> >> +		int i = 0;
> >>
> >>  		rcu_read_lock();
> >> -		for_each_process(p)
> >> +		for_each_process(p) {
> >> +			/* Avoid potential softlockup warning */
> >> +			if ((++i & 1023) == 0)
> >> +				touch_softlockup_watchdog();
> >
> > This might suppress the soft lockup, but won't a rcu stall still be detected?
>
> Yes, rcu stall was still detected.
> For global OOM, system is likely to struggle, do we have to do some
> works to suppress RCU detete? rcu_cpu_stall_reset()?