On Thu 20-06-24 23:27:45, alexjlzheng@xxxxxxxxx wrote:
> From: Jinliang Zheng <alexjlzheng@xxxxxxxxxxx>
>
> When mm_update_next_owner() is racing with swapoff (try_to_unuse()) or /proc or
> ptrace or page migration (get_task_mm()), it is impossible to find an
> appropriate task_struct in the loop whose mm_struct is the same as the target
> mm_struct.
>
> If the above race condition is combined with the stress-ng-zombie and
> stress-ng-dup tests, such a long loop can easily cause a Hard Lockup in
> write_lock_irq() for tasklist_lock.
>
> Recognize this situation in advance and exit early.
>
> Signed-off-by: Jinliang Zheng <alexjlzheng@xxxxxxxxxxx>

Even if this is not really a full fix it is a useful stop gap to catch
at least some cases.

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

> ---
> Changelog:
>
> V2: Fix mm_update_owner_next() to mm_update_next_owner() in comment
> ---
>  kernel/exit.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index f95a2c1338a8..81fcee45d630 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -484,6 +484,8 @@ void mm_update_next_owner(struct mm_struct *mm)
>  	 * Search through everything else, we should not get here often.
>  	 */
>  	for_each_process(g) {
> +		if (atomic_read(&mm->mm_users) <= 1)
> +			break;
>  		if (g->flags & PF_KTHREAD)
>  			continue;
>  		for_each_thread(g, c) {
> --
> 2.39.3
>

-- 
Michal Hocko
SUSE Labs