panic_on_oom allows the administrator to set the OOM policy to panic the
system when it is out of memory. This reduces failover time, e.g. when
resolving the OOM condition would take much more time than rebooting the
system.

out_of_memory tries to be clever and prevent premature panics by checking
the current task: it does not panic when the task has a fatal signal
pending, because the task should die shortly and release some memory.
This is fair enough, but Tetsuo Handa has noted that it might lead to a
silent deadlock when current cannot exit because of dependencies
invisible to the OOM killer.

panic_on_oom is disabled by default, and if somebody enables it then any
risk of a potential deadlock is certainly unwelcome. The risk is really
low because there are usually more sources of allocation requests and one
of them would eventually trigger the panic, but it is better to reduce
the risk as much as possible.

Let's move check_panic_on_oom up before the current task is checked so
that the knob value is honored. Do the same for the memcg in
mem_cgroup_out_of_memory.

Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
---
 mm/memcontrol.c |  3 ++-
 mm/oom_kill.c   | 18 +++++++++---------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 86648a718d21..d3c906da6a09 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1532,6 +1532,8 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 
 	mutex_lock(&oom_lock);
 
+	check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL, memcg);
+
 	/*
 	 * If current has a pending SIGKILL or is exiting, then automatically
 	 * select it. The goal is to allow it to allocate so that it may
@@ -1542,7 +1544,6 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 		goto unlock;
 	}
 
-	check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL, memcg);
 	totalpages = mem_cgroup_get_limit(memcg) ? : 1;
 	for_each_mem_cgroup_tree(iter, memcg) {
 		struct css_task_iter it;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index dff991e0681e..f8c83b791dd5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -667,6 +667,15 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
 		goto out;
 
 	/*
+	 * Check if there were limitations on the allocation (only relevant for
+	 * NUMA) that may require different handling.
+	 */
+	constraint = constrained_alloc(zonelist, gfp_mask, nodemask,
+						&totalpages);
+	mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
+	check_panic_on_oom(constraint, gfp_mask, order, mpol_mask, NULL);
+
+	/*
 	 * If current has a pending SIGKILL or is exiting, then automatically
 	 * select it. The goal is to allow it to allocate so that it may
 	 * quickly exit and free its memory.
@@ -680,15 +689,6 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
 		goto out;
 	}
 
-	/*
-	 * Check if there were limitations on the allocation (only relevant for
-	 * NUMA) that may require different handling.
-	 */
-	constraint = constrained_alloc(zonelist, gfp_mask, nodemask,
-						&totalpages);
-	mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
-	check_panic_on_oom(constraint, gfp_mask, order, mpol_mask, NULL);
-
 	if (sysctl_oom_kill_allocating_task && current->mm &&
 	    !oom_unkillable_task(current, NULL, nodemask) &&
 	    current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
-- 
2.1.4
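
For illustration only, here is a rough user-space model of the ordering
change this patch makes. It is not kernel code: the enum, the two helper
functions and their boolean parameters are hypothetical names invented
for this sketch. It also deliberately ignores the allocation-constraint
handling that the real check_panic_on_oom performs and treats
panic_on_oom as a simple on/off switch.

```c
#include <stdbool.h>

/* Hypothetical outcomes for this model; these are not kernel identifiers. */
enum oom_outcome {
	OOM_PANIC,            /* the system would panic */
	OOM_LET_CURRENT_EXIT, /* current is dying, let it exit and free memory */
	OOM_SELECT_VICTIM     /* fall through to normal victim selection */
};

/*
 * Ordering before the patch: the "current is already dying" shortcut is
 * evaluated first, so an enabled panic_on_oom can be skipped.
 */
static enum oom_outcome oom_old_order(bool panic_on_oom, bool current_dying)
{
	if (current_dying)
		return OOM_LET_CURRENT_EXIT;
	if (panic_on_oom)
		return OOM_PANIC;
	return OOM_SELECT_VICTIM;
}

/*
 * Ordering after the patch: the panic_on_oom check comes first, so the
 * knob takes effect even when current has a fatal signal pending.
 */
static enum oom_outcome oom_new_order(bool panic_on_oom, bool current_dying)
{
	if (panic_on_oom)
		return OOM_PANIC;
	if (current_dying)
		return OOM_LET_CURRENT_EXIT;
	return OOM_SELECT_VICTIM;
}
```

In this model the two orderings differ only when panic_on_oom is enabled
and current is dying, which is the case the changelog describes as a
potential silent deadlock.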