From: David Rientjes <rientjes@xxxxxxxxxx>
Subject: mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes

Since the 2.6 kernel, the oom killer has slightly biased away from
CAP_SYS_ADMIN processes by discounting some of its memory usage in
comparison to other processes.

This has always been implicit and nothing exactly relies on the behavior.

Gaurav notices that __task_cred() can dereference a potentially freed
pointer if the task under consideration is exiting because a reference to
the task_struct is not held.

Remove the CAP_SYS_ADMIN bias so that all processes are treated equally.

If any CAP_SYS_ADMIN process would like to be biased against, it is always
allowed to adjust /proc/pid/oom_score_adj.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803071548510.6996@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
Reported-by: Gaurav Kohli <gkohli@xxxxxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/oom_kill.c |    7 -------
 1 file changed, 7 deletions(-)

diff -puN mm/oom_kill.c~mm-oom-remove-3%-bonus-for-cap_sys_admin-processes mm/oom_kill.c
--- a/mm/oom_kill.c~mm-oom-remove-3%-bonus-for-cap_sys_admin-processes
+++ a/mm/oom_kill.c
@@ -226,13 +226,6 @@ unsigned long oom_badness(struct task_st
 		mm_pgtables_bytes(p->mm) / PAGE_SIZE;
 	task_unlock(p);
 
-	/*
-	 * Root processes get 3% bonus, just like the __vm_enough_memory()
-	 * implementation used by LSMs.
-	 */
-	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
-		points -= (points * 3) / 100;
-
 	/* Normalize to oom_score_adj units */
 	adj *= totalpages / 1000;
 	points += adj;
_
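
For illustration only, the arithmetic touched by this hunk can be modelled
in userspace. This is a rough sketch, not kernel code: model_badness() and
its cap_sys_admin_bonus flag are hypothetical names, the inputs are plain
page counts supplied by the caller, and the result is returned as-is, with
none of the handling that oom_badness() does outside the lines shown in the
hunk.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the scoring lines in the hunk above.
 *
 * points:              pages charged to the task before any adjustment
 * adj:                 the task's oom_score_adj value
 * totalpages:          pages available to the oom killer
 * cap_sys_admin_bonus: true models the old behaviour, false the new one
 */
static long model_badness(long points, long adj, long totalpages,
			  bool cap_sys_admin_bonus)
{
	/* oom_score_adj is documented to range from -1000 to 1000. */
	assert(adj >= -1000 && adj <= 1000);

	/* The lines removed by the patch: a 3% discount on the task's usage. */
	if (cap_sys_admin_bonus)
		points -= (points * 3) / 100;

	/* Normalize to oom_score_adj units (unchanged context lines). */
	adj *= totalpages / 1000;
	points += adj;

	return points;
}
```

With points = 100000, adj = 0 and totalpages = 1000000, the old behaviour
gives 97000 and the new behaviour gives 100000. Setting adj = -3 with the
bonus disabled also gives 97000 for these particular numbers, because each
oom_score_adj unit is worth totalpages / 1000 pages regardless of how much
memory the task itself uses.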