On Wed, May 04, 2022 at 12:00:18PM +0300, Vasily Averin <vvs@xxxxxxxxxx> wrote: > My patches protect the host mostly from misuse, when someone creates > a huge number of nedevices inside a container. Understood. > Frankly speaking I do not see a big difference between memcg of current process, > memcg of newly created child and memcg of its parent. I agree that's not substantial difference. It's relevant for outer entities "injecting" something into a subtree. As I wrote previously, charging to the creator in the generic case is sensible. > As far as I understand, Roman chose the parent memcg because it was a special > case of creating a new memory group. He temporally changed active memcg > in mem_cgroup_css_alloc() and properly accounted all required memcg-specific > allocations. > However, he ignored accounting for a rather large struct mem_cgroup > therefore I think we can do not worry about 128 bytes of kernfs node. Are you referring to the current code (>= v5.18-rc2)? All big structs related to mem_cgroup should be accounted. What is ignored? > Primary I mean here struct mem_cgroup allocation in mem_cgroup_alloc(). Just note that memory controller may not be always enabled so cgroup_mkdir != mem_cgroup_alloc(). > However, I think we need to take into account any other distributions called > inside cgroup_mkdir: struct cgroup and kernefs node in common part and > any other cgroup-cpecific allocations in other .css_alloc functions. > They all can be called from inside container, allocates non-accountable > memory and by this way theoretically can be misused. Also note that (if you're purely on unified hierachy) you can protect against that with cgroup.max.descendants and cgroup.max.depth. Thanks, Michal