Re: [PATCH v3] mm/oom_kill: show oom eligibility when displaying the current memory state of all tasks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri 30-07-21 17:20:02, Aaron Tomlin wrote:
> Changes since v2:
>  - Use single character (e.g. 'R' for MMF_OOM_SKIP) as suggested
>    by Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
>  - Add new header to oom_dump_tasks documentation
>  - Provide further justification
> 
> 
> The output generated by dump_tasks() can be helpful to determine why
> there was an OOM condition and which rogue task potentially caused it.
> Please note that this is only provided when sysctl oom_dump_tasks is
> enabled.
> 
> At the present time, when showing potential OOM victims, we do not
> exclude any task that are not OOM eligible e.g.

Well, this is not precise. We do exclude ineligible. Consider tasks that
are outside of the OOM domain for example. You are right that we are not
excluding all of them though.

> those that have
> MMF_OOM_SKIP set; it is possible that the last OOM killable victim was
> already OOM killed, yet the OOM reaper failed to reclaim memory and set
> MMF_OOM_SKIP. This can be confusing (or perhaps even be misleading) to the
> viewer. Now, we already unconditionally display a task's oom_score_adj_min
> value that can be set to OOM_SCORE_ADJ_MIN which is indicative of an
> "unkillable" task.
> 
> This patch provides a clear indication with regard to the OOM ineligibility
> (and why) of each displayed task with the addition of a new column namely
> "oom_skipped". An example is provided below:
> 
>     [ 5084.524970] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj oom_skipped name
>     [ 5084.526397] [660417]     0 660417    35869      683   167936        0         -1000 M conmon
>     [ 5084.526400] [660452]     0 660452   175834      472    86016        0          -998  pod
>     [ 5084.527460] [752415]     0 752415    35869      650   172032        0         -1000 M conmon
>     [ 5084.527462] [752575] 1001050000 752575   184205    11158   700416        0           999  npm
>     [ 5084.527467] [753606] 1001050000 753606   183380    46843  2134016        0           999  node
>     [ 5084.527581] Memory cgroup out of memory: Killed process 753606 (node) total-vm:733520kB, anon-rss:161228kB, file-rss:26144kB, shmem-rss:0kB, UID:1001050000
> 
> So, a single character 'M' is for OOM_SCORE_ADJ_MIN, 'R' MMF_OOM_SKIP and
> 'V' for in_vfork().

I have to say I dislike this for two reasons. First and foremost it
makes parsing unnecessarily more complex. Now you have a potentially
an empty column to special case. Secondly MMF_OOM_SKIP is an internal
state that shouldn't really leak to userspace IMO. in_vfork is a racy
check and M is already expressed via oom_score_adj so it duplicates an
existing information.

If you really want/need to make any change here then I would propose to
either add E(eligible)/I(ligible) column without any specifics or
consistently skip over all tasks which are not eligible.
> 2.31.1

-- 
Michal Hocko
SUSE Labs




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux