On Mon, Apr 12, 2021 at 10:03:13AM -0400, Waiman Long wrote:
> On 4/9/21 9:51 PM, Roman Gushchin wrote:
> > On Fri, Apr 09, 2021 at 07:18:37PM -0400, Waiman Long wrote:
> > > With the recent introduction of the new slab memory controller, we
> > > eliminate the need for having separate kmemcaches for each memory
> > > cgroup and reduce overall kernel memory usage. However, we also add
> > > additional memory accounting overhead to each call of kmem_cache_alloc()
> > > and kmem_cache_free().
> > >
> > > For workloads that require a lot of kmemcache allocations and
> > > de-allocations, they may experience performance regression as illustrated
> > > in [1].
> > >
> > > With a simple kernel module that performs repeated loop of 100,000,000
> > > kmem_cache_alloc() and kmem_cache_free() of 64-byte object at module
> > > init. The execution time to load the kernel module with and without
> > > memory accounting were:
> > >
> > >   with accounting = 6.798s
> > >   w/o accounting = 1.758s
> > >
> > > That is an increase of 5.04s (287%). With this patchset applied, the
> > > execution time became 4.254s. So the memory accounting overhead is now
> > > 2.496s which is a 50% reduction.
> > Hi Waiman!
> >
> > Thank you for working on it, it's indeed very useful!
> > A couple of questions:
> > 1) did your config included lockdep or not?
> The test kernel is based on a production kernel config and so lockdep isn't
> enabled.
> > 2) do you have a (rough) estimation how much each change contributes
> >    to the overall reduction?
>
> I should have a better breakdown of the effect of individual patches. I
> rerun the benchmarking module with turbo-boosting disabled to reduce
> run-to-run variation. The execution times were:
>
> Before patch: time = 10.800s (with memory accounting), 2.848s (w/o
> accounting), overhead = 7.952s
> After patch 2: time = 9.140s, overhead = 6.292s
> After patch 3: time = 7.641s, overhead = 4.793s
> After patch 5: time = 6.801s, overhead = 3.953s

Thank you!
If there is a v2, I'd include this information in the commit logs.

>
> Patches 1 & 4 are preparatory patches that should not affect performance.
>
> So the memory accounting overhead was reduced by about half.

This is really great! Thanks!
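
For anyone who wants to double-check the per-patch breakdown quoted above, here is a small Python sketch of the arithmetic. It only uses the timings from the quoted measurements (turbo boost disabled). The helper name `overhead_breakdown` and the variable names are made up for illustration and are not part of the patchset or the benchmark module.

```python
# Timings in seconds, copied from the measurements quoted in this thread.
BASELINE_NO_ACCOUNTING = 2.848  # run time without memory accounting

# (label, run time with memory accounting) after each measured step
timings_with_accounting = [
    ("before", 10.800),
    ("patch 2", 9.140),
    ("patch 3", 7.641),
    ("patch 5", 6.801),
]


def overhead_breakdown(timings, baseline):
    """Return (label, overhead, saved_vs_previous_step) for each step.

    overhead = run time with accounting minus the no-accounting baseline.
    saved    = how much the overhead dropped relative to the previous step.
    """
    rows = []
    prev_overhead = None
    for label, run_time in timings:
        overhead = round(run_time - baseline, 3)
        saved = 0.0 if prev_overhead is None else round(prev_overhead - overhead, 3)
        rows.append((label, overhead, saved))
        prev_overhead = overhead
    return rows


rows = overhead_breakdown(timings_with_accounting, BASELINE_NO_ACCOUNTING)

# Fraction of the original accounting overhead removed by the whole series.
total_reduction = 1 - rows[-1][1] / rows[0][1]

for label, overhead, saved in rows:
    print(f"{label:8s} overhead={overhead:.3f}s saved={saved:.3f}s")
print(f"total overhead reduction: {total_reduction:.1%}")
```

The overhead column should match the figures quoted above (7.952s, 6.292s, 4.793s, 3.953s), and the total comes out at roughly one half, consistent with the "reduced by about half" summary.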