Michal Hocko wrote:
> On Sun 03-07-16 11:45:34, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 7d0a275df822..4ea4a649822d 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -286,16 +286,17 @@ enum oom_scan_t oom_scan_process_thread(struct oom_control *oc,
> > >       * Don't allow any other task to have access to the reserves unless
> > >       * the task has MMF_OOM_REAPED because chances that it would release
> > >       * any memory is quite low.
> > > +     * MMF_OOM_NOT_REAPABLE means that the oom_reaper backed off last time
> > > +     * so let it try again.
> > >       */
> > >      if (!is_sysrq_oom(oc) && atomic_read(&task->signal->oom_victims)) {
> > > -        struct task_struct *p = find_lock_task_mm(task);
> > > +        struct mm_struct *mm = task->signal->oom_mm;
> > >          enum oom_scan_t ret = OOM_SCAN_ABORT;
> > >
> > > -        if (p) {
> > > -            if (test_bit(MMF_OOM_REAPED, &p->mm->flags))
> > > -                ret = OOM_SCAN_CONTINUE;
> > > -            task_unlock(p);
> > > -        }
> > > +        if (test_bit(MMF_OOM_REAPED, &mm->flags))
> > > +            ret = OOM_SCAN_CONTINUE;
> > > +        else if (test_bit(MMF_OOM_NOT_REAPABLE, &mm->flags))
> > > +            ret = OOM_SCAN_SELECT;
> >
> > I don't think this is useful.
>
> Well, to be honest me neither but changing the retry logic is not in
> scope of this patch. It just preserved the existing logic. I guess we
> can get rid of it but that deserves a separate patch. The retry was
> implemented to cover unlikely stalls when the lock is held but as this
> hasn't ever been observed in the real life I would agree to remove it to
> simplify the code (even though it is literally few lines of code). I was
> probably overcautious when adding the flag.
>

You mean reverting http://lkml.kernel.org/r/1466426628-15074-10-git-send-email-mhocko@xxxxxxxxxx ?

If we hit a situation where MMF_OOM_NOT_REAPABLE is set, it means that that mm was
used by multiple threads and one of them is blocked.
On the other hand, since task_struct->oom_reaper_list is currently used, we can hit
a sequence like the following (say, T1, T2 and T3 are sharing the same mm):

  (1) T1's mm is queued to oom_reaper_list for the first time by T1.
  (2) The OOM reaper finds that mm for the first time.
  (3) The OOM reaper fails to hold mm->mmap_sem for read because T3 is
      blocked with that mm->mmap_sem held for write.
  (4) T2's mm (which is the same as T1's mm) is queued to oom_reaper_list
      for the second time by T2.
  (5) The OOM reaper still fails to hold mm->mmap_sem for read because T3 is
      blocked with that mm->mmap_sem held for write.
  (6) The OOM reaper sets MMF_OOM_NOT_REAPABLE.
  (7) That mm is dequeued from oom_reaper_list for the first time by the
      OOM reaper.
  (8) The OOM reaper finds that mm for the second time.
  (9) The OOM reaper still fails to hold mm->mmap_sem for read because T3 is
      blocked with that mm->mmap_sem held for write.
  (10) The OOM reaper sets MMF_OOM_REAPED.
  (11) That mm is dequeued from oom_reaper_list for the second time by the
       OOM reaper.

To me, MMF_OOM_NOT_REAPABLE alone is unlikely to be helpful. If an oom_mm_list
which chains mm_struct is used, at least we won't concurrently queue the same mm
while it is under the OOM reaper's operation.