On 4/27/22 19:52, Shakeel Butt wrote:
> On Wed, Apr 27, 2022 at 7:01 AM Michal Koutný <mkoutny@xxxxxxxx> wrote:
>>
>> Hello Vasily.
>>
>> On Wed, Apr 27, 2022 at 01:37:50PM +0300, Vasily Averin <vvs@xxxxxxxxxx> wrote:
>>> diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
>>> index cfa79715fc1a..2881aeeaa880 100644
>>> --- a/fs/kernfs/mount.c
>>> +++ b/fs/kernfs/mount.c
>>> @@ -391,7 +391,7 @@ void __init kernfs_init(void)
>>>  {
>>>          kernfs_node_cache = kmem_cache_create("kernfs_node_cache",
>>>                                                sizeof(struct kernfs_node),
>>> -                                              0, SLAB_PANIC, NULL);
>>> +                                              0, SLAB_PANIC | SLAB_ACCOUNT, NULL);
>>
>> kernfs accounting you say?
>> kernfs backs up also cgroups, so the parent-child accounting comes to my
>> mind.
>> See the temporary switch to parent memcg in mem_cgroup_css_alloc().
>>
>> (I mean this makes some sense but I'd suggest unlumping the kernfs into
>> a separate path for possible discussion and its not-only-netdevice
>> effects.)
>
> I agree with Michal that kernfs accounting should be its own patch.
> Internally at Google, we actually have enabled the memcg accounting of
> kernfs nodes. We have workloads which create 100s of subcontainers and
> without memcg accounting of kernfs we see high system overhead.

I had this idea (i.e. moving kernfs accounting into a separate patch) too,
but finally decided to include it in the current patch.

Kernfs accounting is critical for the described scenario.
Without it, typical netdevice creation charges only ~50% of the allocated
memory, and the rest of the patch cannot protect the host properly.

Now I'm going to follow your recommendation and split the patch.

Thank you,
        Vasily Averin