On 10/17/19 8:28 PM, Roman Gushchin wrote:
> The existing slab memory controller is based on the idea of replicating
> slab allocator internals for each memory cgroup. This approach promises
> a low memory overhead (one pointer per page), and isn't adding too much
> code on hot allocation and release paths. But it has a very serious flaw:
> it leads to a low slab utilization.
>
> Using a drgn* script I've got an estimation of slab utilization on
> a number of machines running different production workloads. In most
> cases it was between 45% and 65%, and the best number I've seen was
> around 85%. Turning kmem accounting off brings it to high 90s. Also
> it brings back 30-50% of slab memory. It means that the real price
> of the existing slab memory controller is way bigger than a pointer
> per page.
>
> The real reason why the existing design leads to a low slab utilization
> is simple: slab pages are used exclusively by one memory cgroup.
> If there are only few allocations of certain size made by a cgroup,
> or if some active objects (e.g. dentries) are left after the cgroup is
> deleted, or the cgroup contains a single-threaded application which is
> barely allocating any kernel objects, but does it every time on a new CPU:
> in all these cases the resulting slab utilization is very low.
> If kmem accounting is off, the kernel is able to use free space
> on slab pages for other allocations.

In the case of the SLUB memory allocator, the waste is not just the
unused space within a slab. The per-cpu slabs can also hold up a lot of
memory, especially if the tasks jump around between different CPUs. The
problem is compounded when a lot of memcgs are in use. Memory
utilization can improve quite significantly if per-cpu slabs are
disabled. Of course, that comes with a performance cost.

Cheers,
Longman
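
As a rough illustration of the two effects discussed above, here is a toy
model of slab utilization. It is only a sketch and not how the kernel
accounts memory: the function names, the even spread of objects across
CPUs, and all the numbers (100 cgroups, 10 live objects each, 32 objects
per slab, 4 CPUs) are assumptions made up for the example.

```python
import math


def slabs_for_cache(n_objects, objs_per_slab, cpus_touched=1):
    """Slabs held by one cache in this toy model.

    Assumption: objects are spread evenly over `cpus_touched` CPUs and
    each CPU fills its own slabs, so a partially filled slab on one CPU
    cannot absorb objects allocated on another.
    """
    per_cpu_objects = math.ceil(n_objects / cpus_touched)
    return cpus_touched * math.ceil(per_cpu_objects / objs_per_slab)


def utilization(n_cgroups, objs_per_cgroup, objs_per_slab,
                shared, cpus_touched=1):
    """Fraction of object slots in allocated slabs holding live objects."""
    live = n_cgroups * objs_per_cgroup
    if shared:
        # One cache for everybody: free slots are usable by any cgroup.
        slabs = slabs_for_cache(live, objs_per_slab, cpus_touched)
    else:
        # One cache per cgroup: slab pages are exclusive to that cgroup.
        slabs = n_cgroups * slabs_for_cache(objs_per_cgroup,
                                            objs_per_slab, cpus_touched)
    return live / (slabs * objs_per_slab)


# Hypothetical workload: 100 cgroups, 10 live objects each, 32 per slab.
shared_util = utilization(100, 10, 32, shared=True)
per_cgroup_util = utilization(100, 10, 32, shared=False)
per_cgroup_4cpu_util = utilization(100, 10, 32, shared=False,
                                   cpus_touched=4)

print(f"shared slabs:               {shared_util:.1%}")
print(f"per-cgroup slabs:           {per_cgroup_util:.1%}")
print(f"per-cgroup slabs, 4 CPUs:   {per_cgroup_4cpu_util:.1%}")
```

In this model, per-cgroup slab pages already leave most slots empty when
each cgroup holds only a few objects, and letting each cgroup's
allocations land on several CPUs multiplies the number of mostly empty
slabs further.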