On Wed, May 11, 2022 at 12:43 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Fri 06-05-22 22:09:16, Ganesan Rajagopal wrote:
> > We run a lot of automated tests when building our software and run into
> > OOM scenarios when the tests run unbounded. v1 memcg exports
> > memcg->watermark as "memory.max_usage_in_bytes" in sysfs. We use this
> > metric to heuristically limit the number of tests that can run in
> > parallel based on per test historical data.
> >
> > This metric is currently not exported for v2 memcg and there is no
> > other easy way of getting this information. getrusage() syscall returns
> > "ru_maxrss" which can be used as an approximation but that's the max
> > RSS of a single child process across all children instead of the
> > aggregated max for all child processes. The only work around is to
> > periodically poll "memory.current" but that's not practical for
> > short-lived one-off cgroups.
> >
> > Hence, expose memcg->watermark as "memory.peak" for v2 memcg.
>
> Yes, I can imagine that a very short lived process can easily escape
> from the monitoring. The memory consumption can be still significant
> though.
>
> The v1 interface allows to reset the value by writing to the file. Have
> you considered that as well?

I hadn't originally but this was discussed and dropped when I posted
the first version of this patch. See
https://www.spinics.net/lists/cgroups/msg32476.html

Ganesan

> > Signed-off-by: Ganesan Rajagopal <rganesan@xxxxxxxxxx>
>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>
> > ---
> >  Documentation/admin-guide/cgroup-v2.rst |  7 +++++++
> >  mm/memcontrol.c                         | 13 +++++++++++++
> >  2 files changed, 20 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 69d7a6983f78..828ce037fb2a 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1208,6 +1208,13 @@ PAGE_SIZE multiple when read back.
> >  	high limit is used and monitored properly, this limit's
> >  	utility is limited to providing the final safety net.
> >
> > +  memory.peak
> > +	A read-only single value file which exists on non-root
> > +	cgroups.
> > +
> > +	The max memory usage recorded for the cgroup and its
> > +	descendants since the creation of the cgroup.
> > +
> >    memory.oom.group
> >  	A read-write single value file which exists on non-root
> >  	cgroups.  The default value is "0".
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 725f76723220..88fa70b5d8af 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -6098,6 +6098,14 @@ static u64 memory_current_read(struct cgroup_subsys_state *css,
> >  	return (u64)page_counter_read(&memcg->memory) * PAGE_SIZE;
> >  }
> >
> > +static u64 memory_peak_read(struct cgroup_subsys_state *css,
> > +			    struct cftype *cft)
> > +{
> > +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> > +
> > +	return (u64)memcg->memory.watermark * PAGE_SIZE;
> > +}
> > +
> >  static int memory_min_show(struct seq_file *m, void *v)
> >  {
> >  	return seq_puts_memcg_tunable(m,
> > @@ -6361,6 +6369,11 @@ static struct cftype memory_files[] = {
> >  		.flags = CFTYPE_NOT_ON_ROOT,
> >  		.read_u64 = memory_current_read,
> >  	},
> > +	{
> > +		.name = "peak",
> > +		.flags = CFTYPE_NOT_ON_ROOT,
> > +		.read_u64 = memory_peak_read,
> > +	},
> >  	{
> >  		.name = "min",
> >  		.flags = CFTYPE_NOT_ON_ROOT,
> > --
> > 2.28.0
>
> --
> Michal Hocko
> SUSE Labs
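
For reference, a minimal userspace sketch (not part of the patch; the
cgroup path below is illustrative) of how a test harness could read the
proposed memory.peak file for a v2 cgroup after a test run:

/* Hypothetical example: read memory.peak for one test's cgroup.
 * The path /sys/fs/cgroup/tests/job42 is made up for illustration. */
#include <stdio.h>

int main(void)
{
	unsigned long long peak;
	FILE *f = fopen("/sys/fs/cgroup/tests/job42/memory.peak", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fscanf(f, "%llu", &peak) != 1) {
		fclose(f);
		return 1;
	}
	fclose(f);
	/* Like memory.current, the value is a PAGE_SIZE multiple in bytes. */
	printf("peak usage: %llu bytes\n", peak);
	return 0;
}

Because the watermark survives until the cgroup is removed, the harness
can read it once after the test exits instead of polling memory.current.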