Hello Michael.

On Wed, Apr 12, 2023 at 09:22:07PM -0400, Michael Honaker <mchonaker@xxxxxxxxx> wrote:
> I have been trying to get an accurate measurement of memory usage of a
> non-root cgroup, specifically a Kubernetes container,

Beware that containers are more or less based on sharing resources. Shared accounting is difficult, hence an _accurate_ measurement may not be available, or the numbers may need some amount of interpretation.

> and noticed some inconsistencies when comparing the value of
> `memory.usage_in_bytes` with the information in `memory.stat`. After
> further investigation of the cgroup docs
> (/admin-guide/cgroups/memory.rst#usage_in_bytes) and an old LKML
> thread ("real meaning of memory.usage_in_bytes"),

[OT: I suggest you move to cgroup v2, the entities are IMO better named and it's also more futureproof ;-)]

> I came to the understanding that `usage_in_bytes` actually shows the
> value of the resource counter which is an overestimation due to the
> counter being split into per-cpu chunks for caching,

I didn't read the thread, but it's true that per-cpu batching may result in an error (of either sign, in theory). Since around v5.13 the implementation has changed and the error bound should be smaller:
O(nr_cpus * nr_cgroups(subtree) * MEMCG_CHARGE_BATCH) -> O(nr_cpus * MEMCG_CHARGE_BATCH).

> and that the real usage can be calculated from RSS+Cache gathered from
> `memory.stat`. I've created cadvisor issue #3286
> (https://github.com/google/cadvisor/issues/3286) which goes into
> greater detail on my investigation with examples.

The difference that you spot there is not caused (merely) by the per-cpu optimization. What you see as the difference is mainly kernel memory (e.g. dentries, inodes, task_struct, ...). RSS+Cache only shows memory that userspace is directly responsible for, not the kernel structures (whose size depends on the kernel implementation, after all).
(On v2, you can see a breakdown of the kernel memory usage, among other things, in memory.stat.)
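To make the comparison concrete, here is a minimal sketch of how one could check whether the gap between `usage_in_bytes` and RSS+Cache is larger than per-cpu batching alone could explain. It is only an illustration under stated assumptions: the helper names are made up, the sample numbers are invented, the page size is assumed to be 4 KiB, and the batch size of 32 pages is a placeholder (the real MEMCG_CHARGE_BATCH depends on the kernel version). It also assumes hierarchical accounting, so it compares against the `total_*` fields.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v1 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def unexplained_bytes(usage_in_bytes, stats, hierarchical=True):
    """Return usage minus RSS+Cache, i.e. what userspace pages do not explain."""
    prefix = "total_" if hierarchical else ""
    userspace = stats[prefix + "rss"] + stats[prefix + "cache"]
    return usage_in_bytes - userspace


def batching_slack_bytes(nr_cpus, charge_batch_pages, page_size=PAGE_SIZE):
    """Rough bound on the per-cpu batching error: nr_cpus * batch * page size."""
    return nr_cpus * charge_batch_pages * page_size


# Invented sample data, not taken from a real system.
SAMPLE_STAT = """cache 1048576
rss 2097152
total_cache 1048576
total_rss 2097152
"""
SAMPLE_USAGE = 5242880

if __name__ == "__main__":
    stats = parse_memory_stat(SAMPLE_STAT)
    gap = unexplained_bytes(SAMPLE_USAGE, stats)
    # charge_batch_pages=32 is a placeholder, check your kernel's value.
    slack = batching_slack_bytes(nr_cpus=8, charge_batch_pages=32)
    print("gap:", gap, "batching slack:", slack)
    if gap > slack:
        print("gap exceeds batching slack; likely kernel memory")
```

On a real system the two inputs would come from the cgroup's `memory.usage_in_bytes` and `memory.stat` files. A gap well above the batching bound points at kernel memory rather than at the per-cpu caching.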

> Is the above understanding still correct with the new page counters?
> If so, could any memory allocations be reflected in `usage_in_bytes`
> but not in `stat` for child cgroups? I want to ensure I'm not
> missing anything by only monitoring the `stat` file.

I hope the above sheds some light on these questions.

Michal