On Fri, 30 Jul 2021 17:20:02 +0100 Aaron Tomlin <atomlin@xxxxxxxxxx> wrote:

> Changes since v2:
>  - Use single character (e.g. 'R' for MMF_OOM_SKIP) as suggested
>    by Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>  - Add new header to oom_dump_tasks documentation
>  - Provide further justification
>
>
> The output generated by dump_tasks() can be helpful to determine why
> there was an OOM condition and which rogue task potentially caused it.
> Please note that this is only provided when sysctl oom_dump_tasks is
> enabled.
>
> At the present time, when showing potential OOM victims, we do not
> exclude any task that are not OOM eligible e.g. those that have
> MMF_OOM_SKIP set; it is possible that the last OOM killable victim was
> already OOM killed, yet the OOM reaper failed to reclaim memory and set
> MMF_OOM_SKIP. This can be confusing (or perhaps even be misleading) to the
> viewer. Now, we already unconditionally display a task's oom_score_adj_min
> value that can be set to OOM_SCORE_ADJ_MIN which is indicative of an
> "unkillable" task.
>
> This patch provides a clear indication with regard to the OOM ineligibility
> (and why) of each displayed task with the addition of a new column namely
> "oom_skipped". An example is provided below:
>
>     [ 5084.524970] [  pid ]        uid   tgid total_vm   rss pgtables_bytes swapents oom_score_adj oom_skipped name
>     [ 5084.526397] [660417]          0 660417    35869   683         167936        0         -1000 M           conmon
>     [ 5084.526400] [660452]          0 660452   175834   472          86016        0          -998             pod
>     [ 5084.527460] [752415]          0 752415    35869   650         172032        0         -1000 M           conmon
>     [ 5084.527462] [752575] 1001050000 752575   184205 11158         700416        0           999             npm
>     [ 5084.527467] [753606] 1001050000 753606   183380 46843        2134016        0           999             node
>     [ 5084.527581] Memory cgroup out of memory: Killed process 753606 (node) total-vm:733520kB, anon-rss:161228kB, file-rss:26144kB, shmem-rss:0kB, UID:1001050000
>
> So, a single character 'M' is for OOM_SCORE_ADJ_MIN, 'R' MMF_OOM_SKIP and
> 'V' for in_vfork().
>
> index 003d5cc3751b..4c79fa00ddb3 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -650,8 +650,9 @@ oom_dump_tasks
>  Enables a system-wide task dump (excluding kernel threads) to be produced
>  when the kernel performs an OOM-killing and includes such information as
>  pid, uid, tgid, vm size, rss, pgtables_bytes, swapents, oom_score_adj
> -score, and name.  This is helpful to determine why the OOM killer was
> -invoked, to identify the rogue task that caused it, and to determine why
> +score, oom eligibility status and name.  This is helpful to determine why
> +the OOM killer was invoked, to identify the rogue task that caused it, and
> +to determine why

It would be better if the meaning of 'M', 'R' and 'V' were described here.

>  the OOM killer chose the task it did to kill.
>
> +/**
> + * is_task_eligible_oom - determine if and why a task cannot be OOM killed
> + * @tsk: task to check
> + *
> + * Needs to be called with task_lock().
> + */
> +static const char * const is_task_oom_eligible(struct task_struct *p)

Name seems inappropriate.  task_oom_eligibility()?

> +{
> +	long adj;
> +
> +	adj = (long)p->signal->oom_score_adj;
> +	if (adj == OOM_SCORE_ADJ_MIN)
> +		return "M";
> +	else if (test_bit(MMF_OOM_SKIP, &p->mm->flags))
> +		return "R";
> +	else if (in_vfork(p))
> +		return "V";
> +	else
> +		return "";
> +}
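
For illustration only, here is a minimal userspace sketch of the same
precedence logic under the suggested name.  struct mock_task and its
fields are hypothetical stand-ins for the task_struct state the patch
reads (p->signal->oom_score_adj, MMF_OOM_SKIP in p->mm->flags and
in_vfork(p)); print_task_row() is likewise made up for the example and
is not the dump_tasks() format string.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same value as the kernel's OOM_SCORE_ADJ_MIN. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for the task state the quoted helper inspects. */
struct mock_task {
	int pid;
	short oom_score_adj;	/* models p->signal->oom_score_adj */
	bool mm_oom_skip;	/* models test_bit(MMF_OOM_SKIP, &p->mm->flags) */
	bool in_vfork;		/* models in_vfork(p) */
	const char *comm;
};

/*
 * Return a one-character reason why the task is not OOM eligible,
 * or "" if it is eligible.  The checks run in the same order as in
 * the quoted patch, so 'M' wins over 'R', which wins over 'V'.
 */
static const char *task_oom_eligibility(const struct mock_task *p)
{
	if (p->oom_score_adj == OOM_SCORE_ADJ_MIN)
		return "M";
	if (p->mm_oom_skip)
		return "R";
	if (p->in_vfork)
		return "V";
	return "";
}

/* Illustrative row printer; the column widths are assumptions. */
static void print_task_row(const struct mock_task *p)
{
	printf("[%7d] %13hd %-11s %s\n", p->pid, p->oom_score_adj,
	       task_oom_eligibility(p), p->comm);
}
```

Because only one character is returned, a task that is both
OOM_SCORE_ADJ_MIN and MMF_OOM_SKIP shows up as 'M' alone, which is
another reason to spell the letters out in vm.rst.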