On Thu 09-07-20 17:01:06, Yafang Shao wrote:
> On Thu, Jul 9, 2020 at 4:18 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Thu 09-07-20 15:41:11, Yafang Shao wrote:
> > > On Thu, Jul 9, 2020 at 2:26 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > >
> > > > From: Michal Hocko <mhocko@xxxxxxxx>
> > > >
> > > > The exported value includes oom_score_adj so the range is not [0, 1000]
> > > > as described in the previous section but rather [0, 2000]. Mention that
> > > > fact explicitly.
> > > >
> > > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > > > ---
> > > >  Documentation/filesystems/proc.rst | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > > > index 8e3b5dffcfa8..78a0dec323a3 100644
> > > > --- a/Documentation/filesystems/proc.rst
> > > > +++ b/Documentation/filesystems/proc.rst
> > > > @@ -1673,6 +1673,9 @@ requires CAP_SYS_RESOURCE.
> > > >  3.2 /proc/<pid>/oom_score - Display current oom-killer score
> > > >  -------------------------------------------------------------
> > > >
> > > > +Please note that the exported value includes oom_score_adj so it is effectively
> > > > +in range [0,2000].
> > > > +
> > >
> > > [0, 2000] may be not a proper range, see my reply in another thread.[1]
> > > As this value hasn't been documented before and nobody notices that, I
> > > think there might be no user really care about it before.
> > > So we should discuss the proper range if we really think the user will
> > > care about this value.
> >
> > Even if we decide the range should change, I do not really assume this
> > will happen, it is good to have the existing behavior clarified.
> >
>
> But the existing behavior is not defined in the kernel documentation
> before, so I don't think that the user has a clear understanding of
> the existing behavior.

Well, documentation is by no means authoritative, especially when it is
outdated or incomplete. What really matters is the observed behavior, and
a lot of userspace depends on it, or even on the specific implementation.

> The way to use the result of proc_oom_score is to compare which
> processes will be killed first by the OOM killer, IOW, the user should
> always use it to compare different processes. For example,
>
> if proc_oom_score(process_a) > proc_oom_score(process_b)
> then
>        process_a will be killed before process_b
> fi
>
> And then the user will "Use it together with
> /proc/<pid>/oom_score_adj to tune which
> process should be killed in an out-of-memory situation."
>
> That means what the user really cares about is the relative value, and
> they will not care about the range or the absolute value.

In an ideal world, yes. But real life tells a different story. Many
times userspace (ab)uses certain undocumented/unintended (mis)features,
and the hard rule is that we never break userspace. We've learned that
through many painful historical experiences. Vaguely defined
functionality suffers from this problem in particular.

-- 
Michal Hocko
SUSE Labs
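
For illustration only, the relative comparison quoted above could look
roughly like this from userspace. This is a minimal sketch, not anything
from the patch: the helper names read_oom_score and kill_order and the
proc_root parameter are hypothetical, and it only mirrors the quoted
"higher score goes first" comparison rather than the full victim
selection done by the kernel. It deliberately relies on the ordering of
the values alone and makes no assumption about their range.

```python
from pathlib import Path


def read_oom_score(pid, proc_root="/proc"):
    """Return the integer exported in <proc_root>/<pid>/oom_score."""
    # proc_root is a hypothetical parameter so the helper can be pointed
    # at a directory other than the real /proc, e.g. for testing.
    return int(Path(proc_root, str(pid), "oom_score").read_text().strip())


def kill_order(pids, proc_root="/proc"):
    """Sort PIDs from highest to lowest exported oom_score.

    Only the relative order of the values is used; nothing here depends
    on whether the range is [0, 1000] or [0, 2000]. PIDs with equal
    scores keep their input order.
    """
    scores = {pid: read_oom_score(pid, proc_root) for pid in pids}
    return sorted(scores, key=scores.get, reverse=True)
```

On a Linux system, kill_order([1, os.getpid()]) (after importing os)
would order the two processes by their current scores. A tool written
this way keeps working if the range changes; one that hardcodes an
upper bound is the kind of userspace that would break.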