On Thu 25-08-22 00:05:04, Shakeel Butt wrote:
> For cgroups using low or min protections, the function
> propagate_protected_usage() was doing an atomic xchg() operation
> irrespectively. We can optimize out this atomic operation for one
> specific scenario where the workload is using the protection (i.e.
> min > 0) and the usage is above the protection (i.e. usage > min).
>
> This scenario is actually very common where the users want a part of
> their workload to be protected against the external reclaim. Though this
> optimization does introduce a race when the usage is around the
> protection and concurrent charges and uncharges trip it over or under
> the protection. In such cases, we might see lower effective protection
> but the subsequent charge/uncharge will correct it.

Thanks, this is much more useful.

> To evaluate the impact of this optimization, on a 72 CPUs machine, we
> ran the following workload in a three-level cgroup hierarchy with the top
> level having min and low set up appropriately to see if this optimization
> is effective for the mentioned case.
>
>  $ netserver -6
>  # 36 instances of netperf with the following params
>  $ netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K
>
> Results (average throughput of netperf):
>  Without (6.0-rc1)	10482.7 Mbps
>  With patch		14542.5 Mbps (38.7% improvement)
>
> With the patch, the throughput improved by 38.7%
>
> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> Acked-by: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
> Reviewed-by: Feng Tang <feng.tang@xxxxxxxxx>
> Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Thanks!

> ---
> Changes since v1:
> - Commit message update with more detail on which scenario is getting
>   optimized and possible race condition.
>
>  mm/page_counter.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index eb156ff5d603..47711aa28161 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -17,24 +17,23 @@ static void propagate_protected_usage(struct page_counter *c,
>  				      unsigned long usage)
>  {
>  	unsigned long protected, old_protected;
> -	unsigned long low, min;
>  	long delta;
>  
>  	if (!c->parent)
>  		return;
>  
> -	min = READ_ONCE(c->min);
> -	if (min || atomic_long_read(&c->min_usage)) {
> -		protected = min(usage, min);
> +	protected = min(usage, READ_ONCE(c->min));
> +	old_protected = atomic_long_read(&c->min_usage);
> +	if (protected != old_protected) {
>  		old_protected = atomic_long_xchg(&c->min_usage, protected);
>  		delta = protected - old_protected;
>  		if (delta)
>  			atomic_long_add(delta, &c->parent->children_min_usage);
>  	}
>  
> -	low = READ_ONCE(c->low);
> -	if (low || atomic_long_read(&c->low_usage)) {
> -		protected = min(usage, low);
> +	protected = min(usage, READ_ONCE(c->low));
> +	old_protected = atomic_long_read(&c->low_usage);
> +	if (protected != old_protected) {
>  		old_protected = atomic_long_xchg(&c->low_usage, protected);
>  		delta = protected - old_protected;
>  		if (delta)
> -- 
> 2.37.1.595.g718a3a8f04-goog

-- 
Michal Hocko
SUSE Labs
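
P.S. For readers following along: below is a minimal userspace model of the
"read before xchg" pattern the patch introduces. It is an illustration only,
not part of the patch, and all names (struct counter, propagate(), etc.) are
hypothetical stand-ins for the page_counter code quoted above. It just shows
that, once usage sits above the protection, the cached value stays pinned at
"min" and the atomic exchange plus the parent update are skipped entirely.

#include <stdatomic.h>
#include <stdio.h>

struct counter {
	atomic_long min_usage;		/* cached min(usage, min) */
	atomic_long children_min_usage;	/* aggregate seen by the parent */
};

static void propagate(struct counter *c, struct counter *parent,
		      long usage, long min)
{
	long protected = usage < min ? usage : min;
	long old = atomic_load(&c->min_usage);

	if (protected == old)
		return;		/* common case: no xchg, no parent update */

	old = atomic_exchange(&c->min_usage, protected);
	atomic_fetch_add(&parent->children_min_usage, protected - old);
}

int main(void)
{
	struct counter child, parent;

	atomic_init(&child.min_usage, 0);
	atomic_init(&child.children_min_usage, 0);
	atomic_init(&parent.min_usage, 0);
	atomic_init(&parent.children_min_usage, 0);

	propagate(&child, &parent, 150, 100);	/* usage > min: caches 100 */
	propagate(&child, &parent, 200, 100);	/* still above min: skipped */
	propagate(&child, &parent, 80, 100);	/* dropped below: caches 80 */

	printf("children_min_usage = %ld\n",
	       (long)atomic_load(&parent.children_min_usage));
	return 0;
}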