On 2019/01/22 3:50, Shakeel Butt wrote:
> From the start of the git history of Linux, the kernel after selecting
> the worst process to be oom-killed, prefer to kill its child (if the
> child does not share mm with the parent). Later it was changed to prefer
> to kill a child who is worst. If the parent is still the worst then the
> parent will be killed.
>
> This heuristic assumes that the children did less work than their parent
> and by killing one of them, the work lost will be less. However this is
> very workload dependent. If there is a workload which can benefit from
> this heuristic, can use oom_score_adj to prefer children to be killed
> before the parent.
>
> The select_bad_process() has already selected the worst process in the
> system/memcg. There is no need to recheck the badness of its children
> and hoping to find a worse candidate. That's a lot of unneeded racy
> work. Also the heuristic is dangerous because it make fork bomb like
> workloads to recover much later because we constantly pick and kill
> processes which are not memory hogs. So, let's remove this whole
> heuristic.
>
> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> Cc: Roman Gushchin <guro@xxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
>
> ---
> Changelog since v1:
> - Improved commit message based on mhocko's comment.
> - Replaced 'p' with 'victim'.
> - Removed extra pr_err message.

But this version no longer prints any of the "Out of memory
(oom_kill_allocating_task)", "Out of memory" and "Memory cgroup out of
memory" messages, which is unexpected. Do we want to propagate that
message to __oom_kill_process()? ;-)