Certainly.
Steps to reproduce:
(1) Create a memory cgroup and set memory.limit_in_bytes.
(2) Move a bash process into the newly created cgroup and set its oom_score_adj to -998.
(3) From that bash, start several processes, each consuming a different amount of memory, until a cgroup OOM is triggered.
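The steps above can be scripted roughly as follows. This is only a sketch: it assumes a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory, an arbitrary cgroup name "test" and 512M limit, and a hypothetical ./test helper that allocates the given number of megabytes; it must run as root and exits quietly if the v1 hierarchy is not present.

```shell
#!/bin/sh
# Sketch of the reproduction steps; requires root and a cgroup v1
# memory controller. Bails out cleanly if that hierarchy is absent.
if [ ! -d /sys/fs/cgroup/memory ]; then
    echo "cgroup v1 memory controller not mounted; nothing to do"
    exit 0
fi

CG=/sys/fs/cgroup/memory/test
mkdir -p "$CG"
echo 512M > "$CG/memory.limit_in_bytes"   # (1) limit the group

echo $$ > "$CG/cgroup.procs"              # (2) move this shell in
echo -998 > /proc/$$/oom_score_adj        #     and lower its badness

# (3) spawn workers that consume different amounts of memory until
#     the limit is hit and the kernel OOM-kills one of them.
for mb in 100 300 400; do
    ./test "$mb" &                        # hypothetical memory hog
done
wait
exit 0
```

Any child started from the cgroup'd shell inherits both the cgroup membership and the oom_score_adj value, which is what puts every worker at -998.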
The resulting report is shown below. When the cgroup OOM happened, process 23777 was killed, even though process 23772 consumed more memory:
[ 591.000970] Tasks state (memory values in pages):
[ 591.000970] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 591.000973] [ 23344] 0 23344 2863 923 61440 0 -998 bash
[ 591.000975] [ 23714] 0 23714 27522 25935 258048 0 -998 test
[ 591.000976] [ 23772] 0 23772 104622 103032 876544 0 -998 test
[ 591.000978] [ 23777] 0 23777 78922 77335 667648 0 -998 test
[ 591.000980] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1,oom_memcg=/test,task_memcg=/test,task=test,pid=23777,uid=0
[ 591.000986] Memory cgroup out of memory: Killed process 23777 (test) total-vm:315688kB, anon-rss:308420kB, file-rss:920kB, shmem-rss:0kB, UID:0 pgtables:667648kB oom_score_adj:-998
The verification steps are the same. After applying this patch, when a cgroup OOM occurs, the process that consumes the most memory is killed first, as shown below:
[195118.961768] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[195118.961770] [ 22283] 0 22283 2862 911 69632 0 -998 bash
[195118.961771] [ 79244] 0 79244 27522 25922 262144 0 -998 test
[195118.961773] [ 79247] 0 79247 53222 51596 462848 0 -998 test
[195118.961776] [ 79263] 0 79263 58362 56744 507904 0 -998 test
[195118.961777] [ 79267] 0 79267 45769 44005 409600 0 -998 test
[195118.961779] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1,oom_memcg=/test,task_memcg=/test,task=test,pid=79263,uid=0
[195118.961786] Memory cgroup out of memory: Killed process 79263 (test) total-vm:233448kB, anon-rss:226048kB, file-rss:928kB, shmem-rss:0kB, UID:0 pgtables:507904kB oom_score_adj:-998
Michal Hocko <mhocko@xxxxxxxxxx> wrote on Fri, Dec 20, 2019 at 3:13 PM:
On Fri 20-12-19 14:26:12, zgpeng.linux@xxxxxxxxx wrote:
> From: zgpeng <zgpeng@xxxxxxxxxxx>
>
> It has been found in multiple business scenarios that when an OOM occurs
> in a cgroup, the process that consumes the most memory in the cgroup is
> not killed first. Analysis showed that every process in the cgroup has
> oom_score_adj set to -998; oom_badness() then computes negative points
> for each of them and uniformly clamps the result to 1, discarding the
> ordering.
Can you provide an example of the oom report?
--
Michal Hocko
SUSE Labs