On Thu, Jun 04, 2020 at 11:32:20PM +0200, Christian Brauner wrote:
> > Is it desirable to have meminfo and cpuinfo as they are today or do
> > people want them to reflect the ``container'' context. So that
> > applications like the JVM don't allocate too many cpus or don't try
> > and consume too much memory, or run on nodes that cgroups currently
> > make unavailable.
> >
> > Are there any users or planned users of this functionality yet?
> >
> > I am concerned that you might be adding functionality that no one will
> > ever use that will just add code to the kernel that no one cares about,
> > that will then accumulate bugs. Having had to work through a few of
> > those cases to make each mount of proc have its own super block I am
> > not a great fan of adding another one.
> >
> > If the runc, lxc and other container runtime folks can productively use
> > such an option to do useful things and they are sensible things to do I
> > don't have any fundamental objection. But I do want to be certain this
> > is a feature that is going to be used.
>
> I'm not sure Alexey is introducing virtualized meminfo and cpuinfo (but
> I haven't had time to look at this patchset).

No. Not yet :) I am only suggesting a way to restrict access, inside a
container, to procfs files about which you know nothing.

> In any case, we are currently virtualizing:
> /proc/cpuinfo
> /proc/diskstats
> /proc/loadavg
> /proc/meminfo
> /proc/stat
> /proc/swaps
> /proc/uptime
> for each container with a tiny in-userspace filesystem LXCFS
> ( https://github.com/lxc/lxcfs )
> and have been doing that for years.

I know about it. The reason such a solution appeared is also clear.

> Having meminfo and cpuinfo virtualized in procfs was something we have
> been wanting for a long time and there have been patches by other people
> (from Siteground, I believe) to achieve this a few years back but were
> disregarded.
>
> I think meminfo and cpuinfo would already be great. And if we're
> virtualizing cpuinfo we also need to virtualize the cpu bits exposed in
> /proc/stat. It would also be great to virtualize /proc/uptime. Right now
> we're achieving this essentially by subtracting the time the init
> process of the pid namespace has started since system boot time, minus
> the time when the system started to get the actual reaper age (It's a
> bit more involved but that's the gist.).
>
> This is all on the topic list for this year's virtual containers
> microconference at Plumbers and I would suggest we try to discuss the
> various requirements for something like this there. (I'm about to send
> the CFP out.)
>
> Christian
>

-- 
Rgrds, legion
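
For reference, the /proc/uptime calculation quoted above can be sketched roughly as follows. This is only an illustration of the gist (host uptime minus the start time of the pid namespace's init), not LXCFS's actual code, which Christian notes is more involved. The function names are made up for this sketch, and it assumes that PID 1 as seen by the caller is the namespace's init and that field 22 of /proc/<pid>/stat (starttime) is in clock ticks since boot.

```python
import os


def parse_uptime(text):
    """Return the first field of /proc/uptime (seconds since boot)."""
    return float(text.split()[0])


def parse_starttime_ticks(stat_text):
    """Return field 22 (starttime, in clock ticks) of /proc/<pid>/stat.

    The comm field (field 2) may contain spaces or parentheses, so split
    after the last ')' instead of on whitespace alone.
    """
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 is rest[19].
    return int(rest[19])


def reaper_age(host_uptime, start_ticks, ticks_per_sec):
    """Seconds the namespace init has been running, clamped at zero."""
    return max(0.0, host_uptime - start_ticks / ticks_per_sec)


if __name__ == "__main__":
    # Assumption: PID 1 here is the init (reaper) of the pid namespace.
    with open("/proc/uptime") as f:
        up = parse_uptime(f.read())
    with open("/proc/1/stat") as f:
        start = parse_starttime_ticks(f.read())
    ticks = os.sysconf("SC_CLK_TCK")
    print(f"{reaper_age(up, start, ticks):.2f}")
```

Run inside a container, this would print the age of the container's init instead of the host's uptime. The parsing is kept separate from the file reads so the arithmetic can be checked without a live /proc.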