On Wed, Jun 17, 2020 at 03:31:10PM +0100, Mel Gorman wrote:
> On Wed, Jun 17, 2020 at 01:24:21PM +0200, Vlastimil Babka wrote:
> > > Not really.
> > >
> > > Sharing a single set of caches adds some overhead to root- and
> > > non-accounted allocations, which is something I've tried hard to avoid
> > > in my original version. But I have to admit, it allows us to simplify
> > > and remove a lot of code, and here it's hard to argue with Johannes,
> > > who pushed for this design.
> > >
> > > Performance testing isn't that easy, because it's not obvious what we
> > > want to test. Obviously, per-object accounting is more expensive, and
> > > measuring something like 1000000 allocations and deallocations in a
> > > row from a single kmem_cache will show a regression. But in the real
> > > world the relative cost of allocations is usually low, and we can get
> > > some benefits from a smaller working set and from having shared
> > > kmem_cache objects cache hot. Not to mention the extra memory savings
> > > and the fragmentation reduction.
> > >
> > > We've done extensive testing of the original version in Facebook
> > > production, and we haven't noticed any regressions so far. But I have
> > > to admit, we were using an original version with two sets of
> > > kmem_caches.
> > >
> > > If you have any specific tests in mind, I can definitely run them. Or
> > > if you can help with the performance evaluation, I'll appreciate it a
> > > lot.
> >
> > Jesper provided some pointers here [1]; it would be really great if you
> > could run at least those microbenchmarks. With mmtests the major
> > question is which subset/profiles to run; maybe the referenced commits
> > provide some hints, or maybe Mel could suggest what he used to evaluate
> > SLAB vs SLUB not so long ago.
> >
>
> Last time, the list of mmtests configurations I used for a basic
> comparison was
>
> db-pgbench-timed-ro-small-ext4
> db-pgbench-timed-ro-small-xfs
> io-dbench4-async-ext4
> io-dbench4-async-xfs
> io-bonnie-dir-async-ext4
> io-bonnie-dir-async-xfs
> io-bonnie-file-async-ext4
> io-bonnie-file-async-xfs
> io-fsmark-xfsrepair-xfs
> io-metadata-xfs
> network-netperf-unbound
> network-netperf-cross-node
> network-netperf-cross-socket
> network-sockperf-unbound
> network-netperf-unix-unbound
> network-netpipe
> network-tbench
> pagereclaim-shrinker-ext4
> scheduler-unbound
> scheduler-forkintensive
> workload-kerndevel-xfs
> workload-thpscale-madvhugepage-xfs
> workload-thpscale-xfs
>
> Some were more valid than others in terms of doing an evaluation. I
> followed up later with a more comprehensive comparison, but that was
> overkill.
>
> Each time I did a slab/slub comparison in the past, I had to reverify
> the rate at which kmem_cache_* functions were actually being called, as
> the pattern can change over time even for the same workload. A
> comparison gets more complicated when comparing cgroups, as ideally
> there would be workloads running in multiple groups, but that gets
> complex and I think it's reasonable to just test the "basic" case
> without cgroups.

Thank you, Mel, for the suggestion! I'll try to come up with some numbers
soon. I guess the networking tests will be the most interesting ones in
this case.

Thanks!

Roman
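
P.S. For concreteness, the kind of tight alloc/free loop mentioned above
(a million kmem_cache_alloc()/kmem_cache_free() pairs on a single cache)
can be sketched as a trivial one-shot kernel module. This is only a rough
illustration of the worst-case microbenchmark being discussed, not Jesper's
actual benchmark; the cache name, object size, iteration count, and the
fail-on-load trick are all illustrative choices.

    #include <linux/module.h>
    #include <linux/slab.h>
    #include <linux/ktime.h>

    static struct kmem_cache *bench_cache;

    static int __init slab_bench_init(void)
    {
            ktime_t start, end;
            void *obj;
            int i;

            /* Illustrative cache: 256-byte objects, memcg-accounted. */
            bench_cache = kmem_cache_create("bench_cache", 256, 0,
                                            SLAB_ACCOUNT, NULL);
            if (!bench_cache)
                    return -ENOMEM;

            start = ktime_get();
            for (i = 0; i < 1000000; i++) {
                    obj = kmem_cache_alloc(bench_cache, GFP_KERNEL);
                    if (!obj)
                            break;
                    kmem_cache_free(bench_cache, obj);
            }
            end = ktime_get();

            pr_info("slab_bench: %d alloc/free pairs in %lld ns\n",
                    i, ktime_to_ns(ktime_sub(end, start)));

            kmem_cache_destroy(bench_cache);

            /* Fail loading on purpose so the module doesn't stay resident. */
            return -EAGAIN;
    }

    module_init(slab_bench_init);
    MODULE_LICENSE("GPL");

Because each object is freed immediately after it is allocated, the loop
mostly hits the per-cpu freelist, which is exactly the scenario where the
per-object accounting overhead is most visible and real workloads are
expected to look much better.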