Re: + mm-consider-subtrees-in-memoryevents.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 13, 2019 at 4:47 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 12-02-19 14:45:42, Andrew Morton wrote:
> [...]
> > From: Chris Down <chris@xxxxxxxxxxxxxx>
> > Subject: mm, memcg: consider subtrees in memory.events
> >
> > memory.stat and other files already consider subtrees in their output, and
> > we should too in order to not present an inconsistent interface.
> >
> > The current situation is fairly confusing, because people interacting with
> > cgroups expect hierarchical behaviour in the vein of memory.stat,
> > cgroup.events, and other files.  For example, this causes confusion when
> > debugging reclaim events under low, as currently these always read "0" at
> > non-leaf memcg nodes, which frequently causes people to misdiagnose breach
> > behaviour.  The same confusion applies to other counters in this file when
> > debugging issues.
> >
> > Aggregation is done at write time instead of at read-time since these
> > counters aren't hot (unlike memory.stat which is per-page, so it does it
> > at read time), and it makes sense to bundle this with the file

I think the above statement (memory.stat read-time aggregation) need
to be fixed due to the latest changes.

> > notifications.
> >
> > After this patch, events are propagated up the hierarchy:
> >
> >     [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> >     low 0
> >     high 0
> >     max 0
> >     oom 0
> >     oom_kill 0
> >     [root@ktst ~]# systemd-run -p MemoryMax=1 true
> >     Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service
> >     [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> >     low 0
> >     high 0
> >     max 7
> >     oom 1
> >     oom_kill 1
> >
> > As this is a change in behaviour, this can be reverted to the old
> > behaviour by mounting with the `memory_localevents' flag set.  However, we
> > use the new behaviour by default as there's a lack of evidence that there
> > are any current users of memory.events that would find this change
> > undesirable.
> >
> > Link: http://lkml.kernel.org/r/20190208224419.GA24772@xxxxxxxxxxxxxx
> > Signed-off-by: Chris Down <chris@xxxxxxxxxxxxxx>
> > Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>

Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

However can we please have memory.events.local merged along with this one?

> > Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> > Cc: Tejun Heo <tj@xxxxxxxxxx>
> > Cc: Roman Gushchin <guro@xxxxxx>
> > Cc: Dennis Zhou <dennis@xxxxxxxxxx>
> > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>
> FTR: As I've already said here [1] I can live with this change as long
> as there is a larger consensus among cgroup v2 users. So let's give this
> some more time before merging to see whether there is such a consensus.
>
> [1] http://lkml.kernel.org/r/20190201102515.GK11599@xxxxxxxxxxxxxx
>
> > ---
> >
> >  Documentation/admin-guide/cgroup-v2.rst |    9 +++++++++
> >  include/linux/cgroup-defs.h             |    5 +++++
> >  include/linux/memcontrol.h              |   10 ++++++++--
> >  kernel/cgroup/cgroup.c                  |   16 ++++++++++++++--
> >  4 files changed, 36 insertions(+), 4 deletions(-)
> >
> > --- a/Documentation/admin-guide/cgroup-v2.rst~mm-consider-subtrees-in-memoryevents
> > +++ a/Documentation/admin-guide/cgroup-v2.rst
> > @@ -177,6 +177,15 @@ cgroup v2 currently supports the followi
> >       ignored on non-init namespace mounts.  Please refer to the
> >       Delegation section for details.
> >
> > +  memory_localevents
> > +
> > +        Only populate memory.events with data for the current cgroup,
> > +        and not any subtrees. This is legacy behaviour, the default
> > +        behaviour without this option is to include subtree counts.
> > +        This option is system wide and can only be set on mount or
> > +        modified through remount from the init namespace. The mount
> > +        option is ignored on non-init namespace mounts.
> > +
> >
> >  Organizing Processes and Threads
> >  --------------------------------
> > --- a/include/linux/cgroup-defs.h~mm-consider-subtrees-in-memoryevents
> > +++ a/include/linux/cgroup-defs.h
> > @@ -83,6 +83,11 @@ enum {
> >        * Enable cpuset controller in v1 cgroup to use v2 behavior.
> >        */
> >       CGRP_ROOT_CPUSET_V2_MODE = (1 << 4),
> > +
> > +     /*
> > +      * Enable legacy local memory.events.
> > +      */
> > +     CGRP_ROOT_MEMORY_LOCAL_EVENTS = (1 << 5),
> >  };
> >
> >  /* cftype->flags */
> > --- a/include/linux/memcontrol.h~mm-consider-subtrees-in-memoryevents
> > +++ a/include/linux/memcontrol.h
> > @@ -789,8 +789,14 @@ static inline void count_memcg_event_mm(
> >  static inline void memcg_memory_event(struct mem_cgroup *memcg,
> >                                     enum memcg_memory_event event)
> >  {
> > -     atomic_long_inc(&memcg->memory_events[event]);
> > -     cgroup_file_notify(&memcg->events_file);
> > +     do {
> > +             atomic_long_inc(&memcg->memory_events[event]);
> > +             cgroup_file_notify(&memcg->events_file);
> > +
> > +             if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> > +                     break;
> > +     } while ((memcg = parent_mem_cgroup(memcg)) &&
> > +              !mem_cgroup_is_root(memcg));
> >  }
> >
> >  static inline void memcg_memory_event_mm(struct mm_struct *mm,
> > --- a/kernel/cgroup/cgroup.c~mm-consider-subtrees-in-memoryevents
> > +++ a/kernel/cgroup/cgroup.c
> > @@ -1775,11 +1775,13 @@ int cgroup_show_path(struct seq_file *sf
> >
> >  enum cgroup2_param {
> >       Opt_nsdelegate,
> > +     Opt_memory_localevents,
> >       nr__cgroup2_params
> >  };
> >
> >  static const struct fs_parameter_spec cgroup2_param_specs[] = {
> > -     fsparam_flag  ("nsdelegate",            Opt_nsdelegate),
> > +     fsparam_flag("nsdelegate",              Opt_nsdelegate),
> > +     fsparam_flag("memory_localevents",      Opt_memory_localevents),
> >       {}
> >  };
> >
> > @@ -1802,6 +1804,9 @@ static int cgroup2_parse_param(struct fs
> >       case Opt_nsdelegate:
> >               ctx->flags |= CGRP_ROOT_NS_DELEGATE;
> >               return 0;
> > +     case Opt_memory_localevents:
> > +             ctx->flags |= CGRP_ROOT_MEMORY_LOCAL_EVENTS;
> > +             return 0;
> >       }
> >       return -EINVAL;
> >  }
> > @@ -1813,6 +1818,11 @@ static void apply_cgroup_root_flags(unsi
> >                       cgrp_dfl_root.flags |= CGRP_ROOT_NS_DELEGATE;
> >               else
> >                       cgrp_dfl_root.flags &= ~CGRP_ROOT_NS_DELEGATE;
> > +
> > +             if (root_flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> > +                     cgrp_dfl_root.flags |= CGRP_ROOT_MEMORY_LOCAL_EVENTS;
> > +             else
> > +                     cgrp_dfl_root.flags &= ~CGRP_ROOT_MEMORY_LOCAL_EVENTS;
> >       }
> >  }
> >
> > @@ -1820,6 +1830,8 @@ static int cgroup_show_options(struct se
> >  {
> >       if (cgrp_dfl_root.flags & CGRP_ROOT_NS_DELEGATE)
> >               seq_puts(seq, ",nsdelegate");
> > +     if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> > +             seq_puts(seq, ",memory_localevents");
> >       return 0;
> >  }
> >
> > @@ -6207,7 +6219,7 @@ static struct kobj_attribute cgroup_dele
> >  static ssize_t features_show(struct kobject *kobj, struct kobj_attribute *attr,
> >                            char *buf)
> >  {
> > -     return snprintf(buf, PAGE_SIZE, "nsdelegate\n");
> > +     return snprintf(buf, PAGE_SIZE, "nsdelegate\nmemory_localevents\n");
> >  }
> >  static struct kobj_attribute cgroup_features_attr = __ATTR_RO(features);
> >
> > _
> >
> > Patches currently in -mm which might be from chris@xxxxxxxxxxxxxx are
> >
> > mm-create-mem_cgroup_from_seq.patch
> > mm-extract-memcg-maxable-seq_file-logic-to-seq_show_memcg_tunable.patch
> > mm-proportional-memorylowmin-reclaim.patch
> > mm-proportional-memorylowmin-reclaim-fix.patch
> > mm-memcontrol-expose-thp-events-on-a-per-memcg-basis.patch
> > mm-memcontrol-expose-thp-events-on-a-per-memcg-basis-fix-2.patch
> > mm-make-memoryemin-the-baseline-for-utilisation-determination.patch
> > mm-rename-ambiguously-named-memorystat-counters-and-functions.patch
> > mm-consider-subtrees-in-memoryevents.patch
>
> --
> Michal Hocko
> SUSE Labs



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux