On Thu 05-11-15 17:32:51, Johannes Weiner wrote:
> On Thu, Nov 05, 2015 at 05:28:03PM +0100, Michal Hocko wrote:
[...]
> > Yes, that part is clear and Johannes made it clear that the kmem tcp
> > part is disabled by default. Or are you considering also all the slab
> > usage by the networking code as well?
>
> Michal, there shouldn't be any tracking or accounting going on per
> default when you boot into a fresh system.
>
> I removed all accounting and statistics on the system level in
> cgroupv2, so distribution kernels can compile-time enable a single,
> feature-complete CONFIG_MEMCG that provides a full memory controller
> while at the same time puts no overhead on users that don't benefit
> from mem control at all and just want to use the machine bare-metal.

Yes, that part is clear and I am not disputing it _at all_. It is just
that chances are high that the memory controller _will_ be enabled on
typical distribution systems. E.g. systemd _is_ enabling all resource
controllers by default for some services with the Delegate=yes option.

> This is completely doable. My new series does it for skmem, but I also
> want to retrofit the code to eliminate that current overhead for page
> cache, anonymous memory, slab memory and so forth.
>
> This is the only sane way to make the memory controller powerful and
> generally useful without having to make unreasonable compromises with
> memory consumers. We shouldn't even be *having* the discussion about
> whether we should sacrifice the quality of our interface in order to
> compromise with a class of users that doesn't care about any of this
> in the first place.
>
> So let's eliminate the cost for non-users, but make the memory
> controller feature-complete and useful--with reasonable cost,
> implementation, and interface--for our actual userbase.
>
> Paying the necessary cost for a functionality you actually want is not
> the problem. Paying for something that doesn't benefit you is.

I completely agree with a reasonable cost for those who _want_ the
functionality. It hasn't been shown, though, that people in general have
actually been missing kmem accounting in the wild. E.g. the kmem
controller is not even enabled in openSUSE or SLES kernels and I do not
remember there being a huge push to enable it.

I do understand that you want to have an out-of-the-box isolation
behavior, which I agree is a nice-to-have feature, especially with a
larger penetration of containerized workloads. But my point still holds.
This is not something everybody wants to have. So having a configuration
option and a boot time option to override it is the most reasonable way
to go. You can clearly see that this is already in demand for the tcp
kmem extension because they really _care_ about every single cpu cycle
even though some part of the userspace happens to have memcg enabled.

The configuration default is a different question and we can discuss
that, because it is not an easy one to decide right now IMHO.
-- 
Michal Hocko
SUSE Labs