Sorry for the late response.

On Mon, Feb 17, 2025 at 06:57:46PM +0100, Michal Koutný wrote:
> Hello.
>
[...]
> > The most simple explanation is visibility. Workloads that used to run
> > solo are being moved to a multi-tenant but non-overcommited environment
> > and they need to know their capacity which they used to get from system
> > metrics.
>
> > Now they have to get from cgroup limit files but usage of
> > cgroup namespace limits those workloads to extract the needed
> > information.
>
> I remember Shakeel said the limit may be set higher in the hierarchy for
> container + siblings but then it's potentially overcommitted, no?
>
> I.e. namespace visibility alone is not the problem. The cgns root's
> memory.max is the shared medium between host and guest through which the
> memory allowance can be passed -- that actually sounds to me like
> Johannes' option b).
>
> (Which leads me to an idea of memory.max.effective that'd only present
> the value iff there's no sibling between tightest ancestor..self. If one
> looks at nr_tasks, it's partial but correct memory available. Not that
> useful due to the partiality.)
>
> Since I was originally fan of the idea, I'm not a strong opponent of
> plain memory.max.effective, especially when Johannes considers the
> option of kernel stepping back here and it may help some users. But I'd
> like to see the original incarnations [2] somehow linked (and maybe
> start only with memory.max as that has some usecases).

Yes, I can link [2] and add more info to the commit message.

Johannes, do you want an effective interface for low and min as well, or
should we keep just the current targeted interfaces for now?

>
> Thanks,
> Michal
>
> [1] https://lore.kernel.org/all/ZcY7NmjkJMhGz8fP@xxxxxxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/all/20240606152232.20253-1-mkoutny@xxxxxxxx/
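
As a rough illustration of what an "effective" memory.max could report, here
is a minimal userspace sketch in Python. It assumes the value of interest is
the tightest memory.max among a cgroup and its ancestors. The function names
and the example paths are hypothetical and are not taken from any posted
patch. It also relies on the ancestors' memory.max files being readable,
which is exactly what a cgroup namespace prevents, so it can only run from
the host side.

```python
import math
from pathlib import Path


def read_memory_max(cgroup_dir: Path) -> float:
    """Return one cgroup's memory.max in bytes, or math.inf if unlimited."""
    limit_file = cgroup_dir / "memory.max"
    if not limit_file.exists():
        # The root cgroup has no memory.max file; treat it as unlimited.
        return math.inf
    text = limit_file.read_text().strip()
    # cgroup v2 writes the literal string "max" when no limit is set.
    return math.inf if text == "max" else int(text)


def effective_memory_max(cgroup_dir: Path, root: Path) -> float:
    """Return the tightest memory.max from cgroup_dir up to root, inclusive.

    Hypothetical helper: it only approximates the idea discussed in the
    thread and ignores siblings that share an ancestor's limit.
    """
    current = cgroup_dir.resolve()
    root = root.resolve()
    if current != root and root not in current.parents:
        raise ValueError(f"{current} is not below {root}")

    tightest = read_memory_max(current)
    while current != root:
        current = current.parent
        tightest = min(tightest, read_memory_max(current))
    return tightest
```

On a host with cgroup v2 mounted at /sys/fs/cgroup, a call might look like
effective_memory_max(Path("/sys/fs/cgroup/workload/job"), Path("/sys/fs/cgroup")),
where "workload/job" is a made-up cgroup path. A result of math.inf means no
limit was found anywhere on the path.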