On Mon, Apr 27, 2020 at 04:21:01PM +0000, Christoph Lameter wrote:
> On Fri, 24 Apr 2020, Roman Gushchin wrote:
>
> > > The patch seems to only use it for setup and debugging? It is used for
> > > every "accounted" allocation???? Where? And what is an "accounted"
> > > allocation?
> > >
> > >
> >
> > Please, take a look at the whole series:
> > https://lore.kernel.org/linux-mm/20200422204708.2176080-1-guro@xxxxxx/T/#t
> >
> > I'm sorry, I had to cc you directly for the whole thing. Your feedback
> > will be highly appreciated.
> >
> > It's used to calculate the offset of the memcg pointer for every slab
> > object which is charged to a memory cgroup. So it must be quite hot.
>
>
> Ahh... Thanks. I just looked at it.
>
> You need this because you have a separate structure attached to a page
> that tracks membership of the slab object to the cgroup. This is used to
> calculate the offset into that array....
>
> Why do you need this? Just slap a pointer to the cgroup as additional
> metadata onto the slab object. Is that not much simpler, safer and faster?
>

So, the problem is that not all slab objects are accounted, and sometimes
we don't know in advance whether they will be accounted or not (with the
current semantics of the __GFP_ACCOUNT and SLAB_ACCOUNT flags). So we have
to either increase the size of ALL slab objects, or create a pair of slab
caches for each size.

The first option is not that cheap in terms of memory overhead, especially
for those who disable cgroups using a boot-time option.
The second should be fine, but it will be less simple in terms of code
complexity (in comparison to the final result of the current proposal).

I'm not strictly against either approach, but I'd like to see a broader
consensus on what's the best approach here.

Thanks!
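
To make the trade-off above a bit more concrete, here is a rough user-space
sketch of the two layouts being compared. It is only an illustration, not
code from the series: struct slab_page_model, obj_index(), obj_memcg() and
per_object_overhead() are hypothetical names, and the real patches may well
do this differently. It models the per-page side array being indexed by the
object's offset within the page, and the extra bytes a page would carry if
every object had its own pointer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct memcg;                   /* opaque stand-in for a memory cgroup */

struct slab_page_model {
	uintptr_t base;             /* address of the first object on the page */
	size_t obj_size;            /* size of one object, in bytes */
	size_t nr_objs;             /* number of objects on the page */
	struct memcg **memcg_vec;   /* side array; NULL if nothing here is accounted */
};

/* Index of an object within its page: byte offset divided by object size. */
static size_t obj_index(const struct slab_page_model *page, uintptr_t obj)
{
	assert(obj >= page->base);
	size_t idx = (obj - page->base) / page->obj_size;
	assert(idx < page->nr_objs);
	return idx;
}

/* Look up the owning cgroup via the side array (the "hot" calculation). */
static struct memcg *obj_memcg(const struct slab_page_model *page, uintptr_t obj)
{
	if (!page->memcg_vec)
		return NULL;        /* unaccounted page: no side array was allocated */
	return page->memcg_vec[obj_index(page, obj)];
}

/* Extra bytes per page if every object carried its own cgroup pointer. */
static size_t per_object_overhead(size_t nr_objs)
{
	return nr_objs * sizeof(struct memcg *);
}
```

In this model an unaccounted page pays nothing beyond the NULL memcg_vec
field, while the pointer-per-object layout pays per_object_overhead() on
every page regardless of whether anything on it is accounted.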