Michal Hocko <mhocko@xxxxxxxx> 于2022年8月30日周二 14:45写道: > > On Tue 30-08-22 13:59:48, Kairui Song wrote: > > From: Kairui Song <kasong@xxxxxxxxxxx> > > > > There are currently two helpers for checking if cgroup kmem > > accounting is enabled: > > > > - mem_cgroup_kmem_disabled > > - memcg_kmem_enabled > > Yes, this is a bit confusing indeed! > > > mem_cgroup_kmem_disabled is a simple helper that returns true if > > cgroup.memory=nokmem is specified, otherwise returns false. > > > > memcg_kmem_enabled is a bit different, it returns true if > > cgroup.memory=nokmem is not specified and there is at least one > > non-root cgroup ever created. And once there is any non-root memcg > > created, it won't go back to return false again. > > > > This may help improve performance for some corner use cases where > > the user enables memory cgroup and kmem accounting globally but never > > create any cgroup. > > > > Considering that corner case is rare, especially nowadays cgroup is > > widely used as a standard way to organize services. > > Is it really that rare? Most configurations would use a default setup, so > both MEMCG enabled and without nokmem on cmd line yet the memory > controller is not enabled in their setups. Actually I don't have too much confidence saying that as well... but AFAIK, almost all distros will create a few sub cgroup on boot by the init (eg. openrc, finit, systemd). Maybe it's not that rare indeed. > > > And the "once > > enabled never disable" behavior is kind of strange. This commit simplifies > > the behavior of memcg_kmem_enabled, making it simply the opposite of > > mem_cgroup_kmem_disabled, always true if cgroup.memory=nokmem is > > not specified. So mem_cgroup_kmem_disabled can be dropped. > > > > This simplifies the code, and besides, memcg_kmem_enabled makes use > > of static key so it has a lower overhead. > > I agree that this is slightly confusing and undocumented. The first step > would be finding out why we need both outside of the memcg proper. > > E.g. it doesn't make much sense to me that count_objcg_event uses the > command line variant when it should be using the dynamic (and more > optimized no branch) variant. > > On the other hand pcpu_alloc_chunk seems to be different because it can > be called before the controller is enabled but maybe we do not need to > waste memory before that? Similarly new_kmalloc_cache. I suspect these > are mostly to simplify the code and reduce special casing. Yes, that's very insightful, let me tidy up the code and logic behind and send a V2 later.