On 2019/06/26 19:47, Michal Hocko wrote:
> On Wed 26-06-19 19:19:20, Tetsuo Handa wrote:
>> Is "mempolicy_nodemask_intersects(tsk) returning true when tsk already
>> passed mpol_put_task_policy(tsk) in do_exit()" what we want?
>>
>> If tsk is an already exit()ed thread group leader, that thread group is
>> needlessly selected by the OOM killer because
>> mempolicy_nodemask_intersects() returns true?
>
> I am sorry but I do not really see how this is related to this
> particular patch. Are you suggesting that has_intersects_mems_allowed is
> racy? More racy now?

I'm questioning the correctness of has_intersects_mems_allowed(). If
mask != NULL, mempolicy_nodemask_intersects() is called on each thread
in the "start" thread group, and as soon as
mempolicy_nodemask_intersects(tsk) returns true for some thread "tsk",
has_intersects_mems_allowed(start) returns true and "start" is
considered an OOM victim candidate. But if one of the threads in that
thread group has already passed mpol_put_task_policy(tsk) in do_exit()
(e.g. a dead thread group leader), mempolicy_nodemask_intersects(tsk)
returns true simply because tsk->mempolicy == NULL.

I don't know the mempolicy internals well, but can mempolicy be
configured on a per-thread basis? If the threads in the "start" thread
group cannot have different mempolicy->mode values, there is (mostly) no
need to use for_each_thread() in has_intersects_mems_allowed(). Instead,
we could use find_lock_task_mm(start) (provided that MMF_OOM_SKIP is
checked, like

  	/* p may not have freeable memory in nodemask */
  -	if (!is_memcg_oom(oc) && !has_intersects_mems_allowed(task, oc))
  +	if (!tsk_is_oom_victim(task) && !is_memcg_oom(oc) && !is_sysrq_oom(oc) &&
  +	    !has_intersects_mems_allowed(task, oc))
   		goto next;

) because thread groups for which find_lock_task_mm() returns NULL won't
be selected as an OOM victim candidate...