Re: [PATCH mm v5 0/9] memcg: accounting for objects allocated by mkdir, cgroup

Dear Michal,
do you still have any concerns about this patch set?

Thank you,
	Vasily Averin

On 6/23/22 17:50, Vasily Averin wrote:
> In some cases, creating a cgroup allocates a noticeable amount of memory.
> This operation can be executed from inside a memory-limited container,
> but currently this memory is not accounted to memcg and can be misused.
> This allows a container to exceed its assigned memory limit and avoid
> memcg OOM. Moreover, in case of global memory shortage on the host,
> the OOM-killer may not find the real memory eater and start killing
> random processes on the host.
> 
> This is especially important for OpenVZ and LXC used for hosting,
> where containers are used by untrusted end users.
> 
> Below are tracing results of mkdir /sys/fs/cgroup/vvs.test on a
> 4-CPU VM with Fedora and a self-compiled upstream kernel. The
> calculations are not precise: they depend on kernel config options,
> number of CPUs and enabled controllers, and ignore possible page
> allocations, etc. However, this is enough to clarify the general
> situation. All allocations are split into:
> - common part, always called for each cgroup type
> - per-cgroup allocations
> 
> In each group we consider 2 corner cases:
> - usual allocations, important for 1-2 CPU nodes/VMs
> - percpu allocations, important for 'big iron'
> 
> common part: 	~11Kb	+  318 bytes percpu
> memcg: 		~17Kb	+ 4692 bytes percpu
> cpu:		~2.5Kb	+ 1036 bytes percpu
> cpuset:		~3Kb	+   12 bytes percpu
> blkcg:		~3Kb	+   12 bytes percpu
> pid:		~1.5Kb	+   12 bytes percpu		
> perf:		 ~320b	+   60 bytes percpu
> -------------------------------------------
> total:		~38Kb	+ 6142 bytes percpu
> currently accounted:	  4668 bytes percpu
> 
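The table can be cross-checked, and scaled to other machine sizes, with a
few lines of Python. This is only a rough sketch: the byte counts are copied
from the table above, "Kb" is assumed to mean 1024 bytes, and the names
ALLOCATIONS, percpu_total and cgroup_footprint are made up for this
illustration. Like the table itself, it ignores page allocations.

```python
# Per-cgroup allocation estimates copied from the table above:
# name -> (regular bytes, percpu bytes).
# Assumption: "Kb" in the table means 1024 bytes.
KB = 1024
ALLOCATIONS = {
    "common": (11 * KB, 318),
    "memcg": (17 * KB, 4692),
    "cpu": (int(2.5 * KB), 1036),
    "cpuset": (3 * KB, 12),
    "blkcg": (3 * KB, 12),
    "pid": (int(1.5 * KB), 12),
    "perf": (320, 60),
}

# Percpu bytes that the table lists as already accounted.
ACCOUNTED_PERCPU = 4668


def regular_total():
    """Sum of the regular (non-percpu) allocations, in bytes."""
    return sum(regular for regular, _ in ALLOCATIONS.values())


def percpu_total():
    """Sum of the percpu allocations for a single CPU, in bytes."""
    return sum(percpu for _, percpu in ALLOCATIONS.values())


def cgroup_footprint(num_cpus):
    """Estimated bytes allocated by one mkdir on a machine with num_cpus CPUs."""
    return regular_total() + percpu_total() * num_cpus


if __name__ == "__main__":
    for cpus in (1, 4, 64, 256):
        print(f"{cpus:4d} CPUs: ~{cgroup_footprint(cpus) / KB:.0f} KiB per cgroup")
```

The percpu column sums to 6142 bytes, matching the "total" row, so 1474
percpu bytes per CPU are left outside the "currently accounted" figure. On
1-2 CPUs the regular allocations dominate the estimate, while on machines
with many CPUs the percpu part grows linearly and takes over.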
> - It's important to account the usual allocations made in the
> common part, because almost all cgroup-specific allocations
> are small. One exception here is the memory cgroup: it allocates a few
> huge objects that should be accounted.
> - Percpu allocations made in the common part and in the memcg and cpu
> cgroups should be accounted; the rest are small and can be ignored.
> - KERNFS objects are allocated both in the common part and in most
> cgroups.
> 
> Details can be found here:
> https://lore.kernel.org/all/d28233ee-bccb-7bc3-c2ec-461fd7f95e6a@xxxxxxxxxx/
> 
> I checked the other cgroup types and found that they can all be ignored.
> Additionally, I found an allocation of struct rt_rq made in the cpu
> cgroup when CONFIG_RT_GROUP_SCHED is enabled; it allocates a huge
> (~1700 bytes) percpu structure and should be accounted too.
> 
> v5:
>  1) re-based to linux-mm (mm-everything-2022-06-22-20-36)
> 
> v4:
>  1) re-based to linux-next (next-20220610)
>    now psi_group is not a part of struct cgroup and is allocated on demand
>  2) added the approval received from Muchun Song
>  3) improved the cover letter description per akpm@'s request
> 
> v3:
>  1) re-based to current upstream (v5.18-11267-gb00ed48bb0a7)
>  2) fixed a few typos
>  3) added received approvals
> 
> v2:
>  1) re-split to simplify possible bisect, re-ordered
>  2) added accounting for percpu psi_group_cpu and cgroup_rstat_cpu,
>      allocated in common part
>  3) added accounting for percpu allocation of struct rt_rq
>      (relevant if CONFIG_RT_GROUP_SCHED is enabled)
>  4) improved patch descriptions
> 
> Vasily Averin (9):
>   memcg: enable accounting for struct cgroup
>   memcg: enable accounting for kernfs nodes
>   memcg: enable accounting for kernfs iattrs
>   memcg: enable accounting for struct simple_xattr
>   memcg: enable accounting for percpu allocation of struct psi_group_cpu
>   memcg: enable accounting for percpu allocation of struct
>     cgroup_rstat_cpu
>   memcg: enable accounting for large allocations in mem_cgroup_css_alloc
>   memcg: enable accounting for allocations in alloc_fair_sched_group
>   memcg: enable accounting for perpu allocation of struct rt_rq
> 
>  fs/kernfs/mount.c      | 6 ++++--
>  fs/xattr.c             | 2 +-
>  kernel/cgroup/cgroup.c | 2 +-
>  kernel/cgroup/rstat.c  | 3 ++-
>  kernel/sched/fair.c    | 4 ++--
>  kernel/sched/psi.c     | 2 +-
>  kernel/sched/rt.c      | 2 +-
>  mm/memcontrol.c        | 4 ++--
>  8 files changed, 14 insertions(+), 11 deletions(-)
> 