Quoting Rob Landley (rlandley@xxxxxxxxxxxxx):
> On 01/06/2011 03:43 PM, Matt Helsley wrote:
> > On Wed, Jan 05, 2011 at 07:46:17PM +0530, Balbir Singh wrote:
> >> On Wed, Jan 5, 2011 at 7:31 PM, Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx> wrote:
> >>> Quoting Daniel Lezcano (daniel.lezcano@xxxxxxx):
> >>>> On 01/05/2011 10:40 AM, Mike Hommey wrote:
> >>>>> [Copy/pasted from a previous message to lkml, where it was suggested to
> >>>>> try containers@]
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I noticed that from within a lxc container, writing "3" to
> >>>>> /proc/sys/vm/drop_caches would flush the host page cache. That sounds a
> >>>>> little dangerous for VPS offerings that would be based on lxc, as in one
> >>>>> VPS instance root user could impact the overall performance of the host.
> >>>>> I don't know about other containers but I've been told openvz isn't
> >>>>> subject to this problem.
> >>>>> I only tested the current Debian Squeeze kernel, which is based on
> >>>>> 2.6.32.27.
> >>>>
> >>>> There is definitively a big work to do with /proc.
> >>>>
> >>>> Some files should be not accessible (/proc/sys/vm/drop_caches,
> >>>> /proc/sys/kernel/sysrq, ...) and some other should be virtualized
> >>>> (/proc/meminfo, /proc/cpuinfo, ...).
> >>>>
> >>>> Serge suggested to create something similar to the cgroup device
> >>>> whitelist but for /proc, maybe it is a good approach for denying
> >>>> access a specific proc's file.
> >>>
> >>> Long-term, user namespaces should fix this - /proc will be owned
> >>> by the user namespace which mounted it, but we can tell proc to
> >>> always have some files (like drop_caches) be owned by init_user_ns.
>
> Changing ownership so a script can't open a file that it otherwise
> could may cause scripts to fail when run in a container. Makes the
> containers less transparent.

While my goal next week is to make containers more transparent, the
official stance from kernel summit a few years ago was: transparent
containers are not a valid goal (as seen from kernel).

Not saying that what you're saying above is wrong, but I *do* argue
that 'silently ignoring the write' is more wrong than refusing the
write :) Fooling userspace is a lose, imo.

Also, we can use a FUSE fs over proc to hide the files. Doing that
now is insufficient because root in the container can just remount
proc over the filter. But after user namespaces, root in the container
has the choice of leaving the filter in place for the sake of his own
userspace, or removing it and getting a bunch of files he can't use.

...

> A heavily loaded system that goes deep into swap without triggering
> the OOM killer can become pretty useless. My home laptop with 2 gigs

Isn't a cgroup that controls both memory and swap access the right
answer to this? (And do we have that now, btw?)

(I'm doing too many things at once so probably not thinking this
through enough)

-serge
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers
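
To make the "refuse rather than silently ignore" argument above concrete,
here is a rough sketch of what a script inside a container could do with a
refused write. It is an illustration only, not something proposed in this
thread: the function name request_drop_caches and its opener parameter are
hypothetical, and it assumes a refused write shows up as EACCES or EPERM,
which is how a permission failure on open or write normally reaches
userspace.

```python
import errno


def request_drop_caches(path="/proc/sys/vm/drop_caches", mode="3", opener=open):
    """Ask the kernel to drop caches by writing `mode` to `path`.

    Returns "written" if the write went through and "refused" if it was
    denied with EACCES or EPERM. Any other error is re-raised.
    `opener` is a hypothetical hook (not part of any kernel interface) so
    the refusal path can be exercised without real privileges.
    """
    try:
        # Errors raised on open, write, or the flush at close all land here.
        with opener(path, "w") as f:
            f.write(mode + "\n")
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EPERM):
            return "refused"
        raise
    return "written"
```

With a refused write the caller gets "refused" and can log it or carry on.
If the write were silently ignored instead, the same call would return
"written" and the script would have no way to tell that nothing happened.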