On Tue, Oct 22, 2019 at 03:31:48PM +0200, Michal Hocko wrote:
> On Thu 17-10-19 17:28:04, Roman Gushchin wrote:
> > This patchset provides a new implementation of the slab memory controller,
> > which aims to reach a much better slab utilization by sharing slab pages
> > between multiple memory cgroups. Below is the short description of the new
> > design (more details in commit messages).
> >
> > Accounting is performed per-object instead of per-page. Slab-related
> > vmstat counters are converted to bytes. Charging is performed on page-basis,
> > with rounding up and remembering leftovers.
> >
> > Memcg ownership data is stored in a per-slab-page vector: for each slab page
> > a vector of corresponding size is allocated. To keep slab memory reparenting
> > working, instead of saving a pointer to the memory cgroup directly an
> > intermediate object is used. It's simply a pointer to a memcg (which can be
> > easily changed to the parent) with a built-in reference counter. This scheme
> > allows to reparent all allocated objects without walking them over and changing
> > memcg pointer to the parent.
> >
> > Instead of creating an individual set of kmem_caches for each memory cgroup,
> > two global sets are used: the root set for non-accounted and root-cgroup
> > allocations and the second set for all other allocations. This allows to
> > simplify the lifetime management of individual kmem_caches: they are destroyed
> > with root counterparts. It allows to remove a good amount of code and make
> > things generally simpler.
>
> What is the performance impact?

As I wrote, so far we haven't found any regression on any real-world workload.
Of course, it's pretty easy to come up with a synthetic test which will show
some performance hit: e.g. allocate and free a large number of objects from a
single cache from a single cgroup. The reason is simple: stats and accounting
are more precise, which requires more work. But I don't think it's a real
problem.
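
For illustration only, here is a rough userspace sketch of the scheme described
in the cover letter quoted above. It is not code from the patchset: the names
(struct memcg, struct obj_ref, struct slab_page and the helper functions), the
4096-byte page size, the eight slots per slab page, and the simplification of
one intermediate object per cgroup are all assumptions made for the example.
It shows the three ideas together: a per-slab-page ownership vector, per-object
charging that rounds up to pages and remembers leftovers, and reparenting by
changing a single pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define OBJS_PER_SLAB    8      /* assumed number of object slots per slab page */

/* Hypothetical stand-in for a memory cgroup: only counts charged pages. */
struct memcg {
    struct memcg *parent;
    size_t charged_pages;
};

/*
 * Hypothetical intermediate object. Slab objects point here instead of at
 * the memcg, so reparenting means updating one pointer.
 */
struct obj_ref {
    struct memcg *memcg;    /* current owner */
    size_t refcount;        /* live objects pointing at this obj_ref */
    size_t leftover_bytes;  /* bytes already charged but not in use */
};

/* Per-slab-page ownership vector; NULL marks a free slot. */
struct slab_page {
    struct obj_ref *owner[OBJS_PER_SLAB];
};

/* Charge one object: charge whole pages only when leftovers run out. */
static void obj_charge(struct obj_ref *ref, size_t bytes)
{
    if (ref->leftover_bytes < bytes) {
        size_t missing = bytes - ref->leftover_bytes;
        size_t pages = (missing + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

        ref->memcg->charged_pages += pages;
        ref->leftover_bytes += pages * SKETCH_PAGE_SIZE;
    }
    ref->leftover_bytes -= bytes;
    ref->refcount++;
}

/* Uncharge one object: return its bytes, release any fully unused pages. */
static void obj_uncharge(struct obj_ref *ref, size_t bytes)
{
    size_t pages;

    assert(ref->refcount > 0);
    ref->refcount--;
    ref->leftover_bytes += bytes;
    pages = ref->leftover_bytes / SKETCH_PAGE_SIZE;
    assert(ref->memcg->charged_pages >= pages);
    ref->memcg->charged_pages -= pages;
    ref->leftover_bytes -= pages * SKETCH_PAGE_SIZE;
}

/*
 * Reparent: move the charge to the parent and switch the pointer.
 * No slab object is visited. Assumes one obj_ref per memcg.
 */
static void obj_reparent(struct obj_ref *ref)
{
    struct memcg *old = ref->memcg;

    assert(old->parent != NULL);
    old->parent->charged_pages += old->charged_pages;
    old->charged_pages = 0;
    ref->memcg = old->parent;
}

/* Take the first free slot on a shared slab page; returns -1 if full. */
static int slab_alloc(struct slab_page *slab, struct obj_ref *ref, size_t size)
{
    for (int i = 0; i < OBJS_PER_SLAB; i++) {
        if (slab->owner[i] == NULL) {
            slab->owner[i] = ref;
            obj_charge(ref, size);
            return i;
        }
    }
    return -1;
}

/* Free a slot; it becomes reusable by any cgroup right away. */
static void slab_free(struct slab_page *slab, int slot, size_t size)
{
    assert(slab->owner[slot] != NULL);
    obj_uncharge(slab->owner[slot], size);
    slab->owner[slot] = NULL;
}
```

In this sketch two cgroups can hold objects on the same slab_page, a second
512-byte allocation from the same cgroup is served from the remembered leftover
without charging another page, and obj_reparent() leaves every owner[] entry
untouched. The extra bookkeeping in obj_charge() and obj_uncharge() on every
allocation and free is the kind of additional work that a tight
allocate/free loop would expose.
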
On the other hand, I expect to see some positive effects from the significantly
reduced number of unmovable pages: memory fragmentation should become lower.
And all kernel objects will reside on a smaller number of pages, so we can
expect better cache utilization.

> Also what is the effect on the memory
> reclaim side and the isolation. I would expect that mixing objects from
> different cgroups would have a negative/unpredictable impact on the
> memcg slab shrinking.

Slab shrinking already works on a per-object basis, so no changes here.
Quite the opposite: now the freed space can be reused by other cgroups, where
previously it was often a useless operation, as nobody could reuse the space
until all objects were freed and the page could be returned to the page
allocator.

Thanks!