On Tue 09-10-18 08:35:41, Michal Hocko wrote:
> [I have only now noticed that the patch has been reposted]
>
> On Mon 08-10-18 18:27:39, Tetsuo Handa wrote:
> > On 2018/10/08 17:38, Yong-Taek Lee wrote:
> [...]
> > > Thank you for your suggestion. But i think it would be better to separate to 2 issues. How about think these
> > > issues separately because there are no dependency between race issue and my patch. As i already explained,
> > > for_each_process path is meaningless if there is only one thread group with many threads(mm_users > 1 but
> > > no other thread group sharing same mm). Do you have any other idea to avoid meaningless loop ?
> >
> > Yes. I suggest reverting commit 44a70adec910d692 ("mm, oom_adj: make sure processes
> > sharing mm have same view of oom_score_adj") and commit 97fd49c2355ffded ("mm, oom:
> > kill all tasks sharing the mm").
>
> This would require a lot of other work for something as border line as
> weird threading model like this. I will think about something more
> appropriate - e.g. we can take mmap_sem for read while doing this check
> and that should prevent from races with [v]fork.

Not really. We do not even take the mmap_sem when CLONE_VM. So this is
not the way. Doing a proper synchronization seems much harder. So let's
consider what the worst case scenario is. We would basically hit a race
window between copy_signal and copy_mm, and the only relevant case would
be OOM_SCORE_ADJ_MIN, which wouldn't propagate to the new "thread". The
OOM killer could then pick up the "thread" and kill it along with the
whole process group sharing the mm. Well, that is unfortunate indeed and
it breaks the OOM_SCORE_ADJ_MIN contract. There are basically two ways
here:

1) do not care and encourage users to use a saner way to set
   OOM_SCORE_ADJ_MIN, e.g. setting it before [v]fork & exec, because
   doing that externally is racy anyway. Btw. do we know about an actual
   user who would care?
2) add an OOM_SCORE_ADJ_MIN check and do not kill tasks sharing the mm
   and do not reap the mm in the rare case of the race.

I would prefer the first, but if this race really has to be addressed
then option 2 sounds more reasonable than the wholesale revert.
-- 
Michal Hocko
SUSE Labs
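
For illustration only, option 1 could look roughly like the userspace
sketch below: the parent writes its own /proc/self/oom_score_adj before
it forks, so the child starts with the value already in place and nobody
has to write to the child's proc file from the outside. The helper names
write_oom_score_adj() and spawn_with_oom_score_adj() are hypothetical and
not taken from any existing tool, and the sketch assumes the child
inherits the parent's value across fork, which is the usual behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Same value the kernel uses for OOM_SCORE_ADJ_MIN; defined locally so
 * the sketch does not depend on kernel headers. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical helper: write `value` to an oom_score_adj file.
 * The path is a parameter so it can be /proc/self/oom_score_adj or,
 * for experiments, an ordinary file. Returns 0 on success, -1 on error. */
int write_oom_score_adj(const char *path, int value)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fprintf(f, "%d\n", value) < 0) {
        fclose(f);
        return -1;
    }
    /* stdio buffers the data, so a rejected write shows up at fclose(). */
    return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical helper: set the adjustment on the calling process first,
 * then fork and exec argv. Note that the caller's own value changes too.
 * Returns the child's pid in the parent, or -1 on failure. */
pid_t spawn_with_oom_score_adj(int value, char *const argv[])
{
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);

    if (write_oom_score_adj("/proc/self/oom_score_adj", value) != 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    return pid;
}
```

A caller would use it as spawn_with_oom_score_adj(OOM_SCORE_ADJ_MIN, argv)
and then wait for the returned pid. Lowering the value normally requires
CAP_SYS_RESOURCE, so an unprivileged caller should expect -1 here.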