On 5/5/22 12:47, Michal Koutný wrote:
> On Thu, May 05, 2022 at 12:16:12AM +0300, Vasily Averin <vvs@xxxxxxxxxx> wrote:
>> I think it should allocate at least 2 pages.
>
> After decoding kmalloc_type(), I agree this falls into a global
> (unaccounted) kmalloc_cache.
>
>> However if cgroup_mkdir() calls mem_cgroup_alloc() it correctly accounts the huge percpu
>> allocations but ignores the neighbouring multipage allocation.
>
> So, the spillover is bounded and proportional to the memcg limit (the same
> ratio as these two sizes).
> But it may be better to account it properly, especially if it's
> a contribution from an offlined mem_cgroup.

I've traced mkdir /sys/fs/cgroup/vvs.test on a 4-CPU VM running Fedora and a
self-compiled upstream kernel; see the table with the results below.
The calculations are not precise: they depend on kernel config options, the
number of CPUs and the enabled controllers, and they ignore possible page
allocations. However, I think this is enough to clarify the general situation.

Results:
- The traced allocations sum to ~60Kb in total.
- Only the two huge percpu allocations, marked '=', are accounted today
  (~18Kb), and even these can be 0 without the memory controller.
- kernfs nodes and iattrs are among the main memory consumers.
  They are marked '++' and should be accounted first.
- cgroup_mkdir() always allocates 4Kb, so I think it should be accounted
  first too.
- The mem_cgroup_css_alloc() allocations consume ~10Kb, which is enough to be
  worth accounting, especially on VMs with 1-2 CPUs.
- Almost all other allocations are quite small and can be ignored.
  The exception is the percpu allocations in alloc_fair_sched_group(): they
  can consume a significant amount of memory on nodes with many processors.
  They are marked '+' and can be accounted later.
- kernfs nodes consume another ~6Kb inside simple_xattr_set() and
  simple_xattr_alloc(). That is a fairly large amount, but it is not
  critical, and I think we can ignore it for the moment.
- If all of the proposed allocations are accounted, we cover ~47Kb, or ~75%
  of all allocated memory.

Any comments are welcome.

Thank you,
	Vasily Averin

 allocs  bytes/alloc    total     sum  note  call_site
------------------------------------------------------------
      1        14448    14448   14448   =   percpu_alloc_percpu:
      1         8192     8192   22640  ++   (mem_cgroup_css_alloc+0x54)
     49          128     6272   28912  ++   (__kernfs_new_node+0x4e)
     49           96     4704   33616   ?   (simple_xattr_alloc+0x2c)
     49           88     4312   37928  ++   (__kernfs_iattrs+0x56)
      1         4096     4096   42024  ++   (cgroup_mkdir+0xc7)
      1         3840     3840   45864   =   percpu_alloc_percpu:
      4          512     2048   47912   +   (alloc_fair_sched_group+0x166)
      4          512     2048   49960   +   (alloc_fair_sched_group+0x139)
      1         2048     2048   52008  ++   (mem_cgroup_css_alloc+0x109)
     49           32     1568   53576   ?   (simple_xattr_set+0x5b)
      2          584     1168   54744       (radix_tree_node_alloc.constprop.0+0x8d)
      1         1024     1024   55768       (cpuset_css_alloc+0x30)
      1         1024     1024   56792       (alloc_shrinker_info+0x79)
      1          768      768   57560       percpu_alloc_percpu:
      1          640      640   58200       (sched_create_group+0x1c)
     33           16      528   58728       (__kernfs_new_node+0x31)
      1          512      512   59240       (pids_css_alloc+0x1b)
      1          512      512   59752       (blkcg_css_alloc+0x39)
      9           48      432   60184       percpu_alloc_percpu:
     13           32      416   60600       (__kernfs_new_node+0x31)
      1          384      384   60984       percpu_alloc_percpu:
      1          256      256   61240       (perf_cgroup_css_alloc+0x1c)
      1          192      192   61432       percpu_alloc_percpu:
      1           64       64   61496       (mem_cgroup_css_alloc+0x363)
      1           32       32   61528       (ioprio_alloc_cpd+0x39)
      1           32       32   61560       (ioc_cpd_alloc+0x39)
      1           32       32   61592       (blkcg_css_alloc+0x6b)
      1           32       32   61624       (alloc_fair_sched_group+0x52)
      1           32       32   61656       (alloc_fair_sched_group+0x2e)
      3            8       24   61680       (__kernfs_new_node+0x31)
      3            8       24   61704       (alloc_cpumask_var_node+0x1b)
      1           24       24   61728       percpu_alloc_percpu:
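
P.S. For anyone not following the accounting machinery closely: charging the
slab objects above to the memcg of the task doing the mkdir comes down to
switching the allocation sites to the accounted variants. Below is a minimal
sketch of the two usual patterns; struct example_obj and example_cachep are
made-up names for illustration only, this is not the actual cgroup/kernfs code:

	#include <linux/slab.h>

	/* Hypothetical object, standing in for something like a kernfs node. */
	struct example_obj {
		unsigned long flags;
		char name[32];
	};

	static struct kmem_cache *example_cachep;

	static int example_alloc_demo(void)
	{
		void *buf;
		struct example_obj *obj;

		/*
		 * Pattern 1: a one-off kmalloc() is charged to the current
		 * memcg by passing __GFP_ACCOUNT, i.e. GFP_KERNEL_ACCOUNT
		 * instead of GFP_KERNEL at the call site.
		 */
		buf = kmalloc(4096, GFP_KERNEL_ACCOUNT);
		if (!buf)
			return -ENOMEM;

		/*
		 * Pattern 2: create a dedicated cache with SLAB_ACCOUNT;
		 * every object allocated from it is then charged
		 * automatically and the call sites keep plain GFP_KERNEL.
		 */
		example_cachep = kmem_cache_create("example_cache",
						   sizeof(struct example_obj),
						   0, SLAB_ACCOUNT, NULL);
		if (!example_cachep) {
			kfree(buf);
			return -ENOMEM;
		}
		obj = kmem_cache_zalloc(example_cachep, GFP_KERNEL);
		if (!obj) {
			kmem_cache_destroy(example_cachep);
			kfree(buf);
			return -ENOMEM;
		}

		kmem_cache_free(example_cachep, obj);
		kmem_cache_destroy(example_cachep);
		kfree(buf);
		return 0;
	}

The '++'/'+' call sites in the table would map onto one of these two patterns
depending on whether they use kmalloc() or a dedicated cache; the percpu
allocations would need __GFP_ACCOUNT passed to their alloc_percpu_gfp() calls
in the same way.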