On Mon, May 02, 2022 at 03:15:51PM +0300, Vasily Averin wrote:
> Creating a new netdevice allocates at least ~50Kb of memory for various
> kernel objects, but only ~5Kb of them are accounted to memcg. As a result,
> creating an unlimited number of netdevices inside a memcg-limited container
> does not fall within memcg restrictions, consumes a significant part
> of the host's memory, can cause global OOM and lead to random kills of
> host processes.
>
> The main consumers of non-accounted memory are:
>  ~10Kb   80+ kernfs nodes
>   ~6Kb   ipv6_add_dev() allocations
>    6Kb   __register_sysctl_table() allocations
>    4Kb   neigh_sysctl_register() allocations
>    4Kb   __devinet_sysctl_register() allocations
>    4Kb   __addrconf_sysctl_register() allocations
>
> Accounting of these objects allows increasing the share of memcg-related
> memory up to 60-70% (~38Kb accounted vs ~54Kb total for dummy netdevice
> on typical VM with default Fedora 35 kernel) and this should be enough
> to somehow protect the host from misuse inside container.
>
> Other related objects are quite small and may not be taken into account
> to minimize the expected performance degradation.
>
> The ~300 bytes of percpu allocation of struct ipstats_mib in
> snmp6_alloc_dev() should be mentioned separately: on huge multi-cpu nodes
> it can become the main consumer of memory.
>
> This patch does not enable kernfs accounting as it affects
> other parts of the kernel and should be discussed separately.
> However, even without kernfs, this patch significantly improves the
> current situation and allows taking into account more than half
> of all netdevice allocations.
>
> Signed-off-by: Vasily Averin <vvs@xxxxxxxxxx>
> ---
> v2: 1) kernfs accounting moved into separate patch, suggested by
>        Shakeel and mkoutny@.
>     2) in ipv6_add_dev() changed original "sizeof(struct inet6_dev)"
>        to "sizeof(*ndev)", according to checkpatch.pl recommendation:
>          CHECK: Prefer kzalloc(sizeof(*ndev)...) over kzalloc(sizeof
>            (struct inet6_dev)...)
> ---
>  fs/proc/proc_sysctl.c | 2 +-

for proc_sysctl:

Acked-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>

  Luis
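
For reference, the shares quoted in the patch description can be checked with a few lines of arithmetic. This is only a sketch: it assumes the ~38Kb "accounted" figure includes the ~10Kb of kernfs nodes from the consumer list (the description does not say so explicitly), and the CPU counts are hypothetical values chosen to show how the ~300-byte percpu allocation scales. The function names are made up for this illustration.

```python
# Approximate figures quoted in the patch description, in KB.
TOTAL_KB = 54       # total per dummy netdevice
ACCOUNTED_KB = 38   # accounted figure; ASSUMED here to include kernfs
KERNFS_KB = 10      # kernfs nodes, which this patch does not account


def accounted_share(accounted_kb, total_kb):
    """Return the accounted part of the total as a percentage."""
    return 100.0 * accounted_kb / total_kb


def percpu_footprint_bytes(per_cpu_bytes, possible_cpus):
    """A percpu allocation reserves one copy per possible CPU."""
    return per_cpu_bytes * possible_cpus


with_kernfs = accounted_share(ACCOUNTED_KB, TOTAL_KB)
without_kernfs = accounted_share(ACCOUNTED_KB - KERNFS_KB, TOTAL_KB)
print(f"with kernfs:    {with_kernfs:.1f}%")
print(f"without kernfs: {without_kernfs:.1f}%")

# Hypothetical CPU counts, used only to illustrate the scaling.
for cpus in (4, 64, 256):
    print(f"{cpus:3d} CPUs: {percpu_footprint_bytes(300, cpus)} bytes")
```

Under that assumption the numbers line up with the description: roughly 70% with kernfs accounted and just over half without it, while at a few hundred CPUs the percpu ipstats_mib allocation alone exceeds the ~54Kb total.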