Hi Michal,

On Wed, Oct 13, 2021 at 11:01 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>
> On Fri, Oct 01, 2021 at 12:00:39PM -0700, Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> > In this patch we kept the stats update codepath very minimal and let the
> > stats reader side to flush the stats only when the updates are over a
> > specific threshold. For now the threshold is (nr_cpus * CHARGE_BATCH).
>
> BTW, a noob question -- are the updates always single page sized?
>
> This is motivated by apples vs oranges comparison since the
>   nr_cpus * MEMCG_CHARGE_BATCH
> suggests what could the expected error be in pages (bytes). But it's mostly
> wrong since: a) uncertain single-page updates, b) various counter
> updates summed together. I wonder whether the formula can serve to
> provide at least some (upper) estimate.
>

Thanks for your review. This forced me to think more about it, because each
update is not necessarily a single-page-sized update, e.g. adding a hugepage
to an LRU. I think the error is still time-bounded by 2 seconds, but within
those 2 seconds it can mathematically become large.

What do you think of the following change? It bounds the error better within
the 2-second window.


From e87a36eedd02b0d10d8f66f83833bd6e2bae17b8 Mon Sep 17 00:00:00 2001
From: Shakeel Butt <shakeelb@xxxxxxxxxx>
Date: Thu, 14 Oct 2021 08:49:06 -0700
Subject: [PATCH] Better bounds on the stats error

---
 mm/memcontrol.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8f1d9c028897..e5d5c850a521 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -626,14 +626,20 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
 static void flush_memcg_stats_dwork(struct work_struct *w);
 static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork);
 static DEFINE_SPINLOCK(stats_flush_lock);
-static DEFINE_PER_CPU(unsigned int, stats_updates);
+static DEFINE_PER_CPU(int, stats_diff);
 static atomic_t stats_flush_threshold = ATOMIC_INIT(0);
 
-static inline void memcg_rstat_updated(struct mem_cgroup *memcg)
+static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 {
+	unsigned int x;
+
 	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
-	if (!(__this_cpu_inc_return(stats_updates) % MEMCG_CHARGE_BATCH))
-		atomic_inc(&stats_flush_threshold);
+
+	x = abs(__this_cpu_add_return(stats_diff, val));
+	if (x > MEMCG_CHARGE_BATCH) {
+		atomic_add(x / MEMCG_CHARGE_BATCH, &stats_flush_threshold);
+		__this_cpu_write(stats_diff, 0);
+	}
 }
 
 static void __mem_cgroup_flush_stats(void)
@@ -672,7 +678,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
-	memcg_rstat_updated(memcg);
+	memcg_rstat_updated(memcg, val);
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
@@ -705,7 +711,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
 
-	memcg_rstat_updated(memcg);
+	memcg_rstat_updated(memcg, val);
 }
 
 /**
@@ -807,7 +813,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->events[idx], count);
-	memcg_rstat_updated(memcg);
+	memcg_rstat_updated(memcg, count);
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
-- 
2.33.0.882.g93a45727a2-goog
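
P.S. To make the bound concrete, below is a minimal single-threaded userspace
sketch of the accumulation logic in memcg_rstat_updated() above. It is not
kernel code: the batch value, the helper name rstat_updated() and the sample
update sizes are stand-ins for illustration only. It shows that the per-CPU
residue (stats_diff) never exceeds the batch size after any update, no matter
how large individual updates are, while anything bigger is converted into
stats_flush_threshold increments that the reader side can act on.

#include <stdio.h>
#include <stdlib.h>

/* illustrative stand-in for the kernel's MEMCG_CHARGE_BATCH */
#define MEMCG_CHARGE_BATCH 32

static int stats_diff;              /* simulates the per-CPU signed residue */
static long stats_flush_threshold;  /* simulates the global atomic flush hint */

/* single-CPU analog of memcg_rstat_updated() from the patch above */
static void rstat_updated(int val)
{
	unsigned int x;

	stats_diff += val;
	x = abs(stats_diff);
	if (x > MEMCG_CHARGE_BATCH) {
		stats_flush_threshold += x / MEMCG_CHARGE_BATCH;
		stats_diff = 0;
	}
}

int main(void)
{
	/* a mix of single-page, hugepage-sized and negative updates */
	int updates[] = { 1, 512, -1, 1, -512, 200, 7 };
	size_t i;

	for (i = 0; i < sizeof(updates) / sizeof(updates[0]); i++) {
		rstat_updated(updates[i]);
		printf("update %4d -> residue %4d, threshold %ld\n",
		       updates[i], stats_diff, stats_flush_threshold);
	}
	return 0;
}

With this, the unflushed error should stay on the order of
nr_cpus * MEMCG_CHARGE_BATCH in update units (pages/bytes/events) rather than
in number of updates, which is what the change above is aiming for.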