On Sun, Jun 21, 2020 at 06:57:52PM -0400, Qian Cai wrote:
> On Mon, Jun 08, 2020 at 04:06:35PM -0700, Roman Gushchin wrote:
> > This is v6 of the slab cgroup controller rework.
> >
> > The patchset moves the accounting from the page level to the object
> > level. It allows to share slab pages between memory cgroups.
> > This leads to a significant win in the slab utilization (up to 45%)
> > and the corresponding drop in the total kernel memory footprint.
> > The reduced number of unmovable slab pages should also have a positive
> > effect on the memory fragmentation.
> >
> > The patchset makes the slab accounting code simpler: there is no more
> > need in the complicated dynamic creation and destruction of per-cgroup
> > slab caches, all memory cgroups use a global set of shared slab caches.
> > The lifetime of slab caches is not more connected to the lifetime
> > of memory cgroups.
> >
> > The more precise accounting does require more CPU, however in practice
> > the difference seems to be negligible. We've been using the new slab
> > controller in Facebook production for several months with different
> > workloads and haven't seen any noticeable regressions. What we've seen
> > were memory savings in order of 1 GB per host (it varied heavily depending
> > on the actual workload, size of RAM, number of CPUs, memory pressure, etc).
> >
> > The third version of the patchset added yet another step towards
> > the simplification of the code: sharing of slab caches between
> > accounted and non-accounted allocations. It comes with significant
> > upsides (most noticeable, a complete elimination of dynamic slab caches
> > creation) but not without some regression risks, so this change sits
> > on top of the patchset and is not completely merged in. So in the unlikely
> > event of a noticeable performance regression it can be reverted separately.
>
> Reverting this series and its dependency [1], i.e.,
>
> git revert --no-edit 05923a2ccacd..07666ee77fb4
>
> on the top of next-20200621 fixed an issue where kmemleak could report
> thousands of leaks like this below using this .config (if ever matters),
>
> https://github.com/cailca/linux-mm/blob/master/x86.config
>
> unreferenced object 0xffff888ff2bf6200 (size 512):

Hi Qian!

My wild guess is that kmemleak is getting confused because we modify the
lowest bit of the page->mem_cgroup/obj_cgroups pointer:

struct page {
	...
	union {
		struct mem_cgroup *mem_cgroup;
		struct obj_cgroup **obj_cgroups;
	};
	...
}

We're using the lowest bit to distinguish between a "normal" mem_cgroup
pointer and a vector of obj_cgroup pointers.

The pointer to the obj_cgroup vector is saved only here, so if we're
modifying the address, I guess that's what makes kmemleak think there is
a leak.

Or do you have a real leak?

Thanks!
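
To illustrate the low-bit trick described above, here is a minimal user-space
sketch. It is not the kernel code: the struct is a stand-in, and the names
OBJCG_VEC_FLAG, tag_objcg_vec(), is_objcg_vec() and untag_objcg_vec() are
made up for this example. It only shows that once the low bit is set, the
stored word is no longer equal to the address of the vector, which is the
property the guess above depends on. Whether that is really what confuses
kmemleak is still just a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel type; not the real definition. */
struct obj_cgroup { int id; };

/* Hypothetical name for the flag kept in the lowest pointer bit. */
#define OBJCG_VEC_FLAG ((uintptr_t)1)

/* Mark a vector pointer by setting its lowest bit. */
static uintptr_t tag_objcg_vec(struct obj_cgroup **vec)
{
	uintptr_t raw = (uintptr_t)vec;

	/* Relies on the vector being at least 2-byte aligned. */
	assert((raw & OBJCG_VEC_FLAG) == 0);
	return raw | OBJCG_VEC_FLAG;
}

/* True if the stored word carries the vector flag. */
static bool is_objcg_vec(uintptr_t stored)
{
	return (stored & OBJCG_VEC_FLAG) != 0;
}

/* Clear the flag to recover the usable vector pointer. */
static struct obj_cgroup **untag_objcg_vec(uintptr_t stored)
{
	return (struct obj_cgroup **)(stored & ~OBJCG_VEC_FLAG);
}
```

A caller would store the result of tag_objcg_vec() in the shared field, test
it with is_objcg_vec() before use, and go through untag_objcg_vec() to get
the real address back.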