Re: Why does devices cgroup check for CAP_SYS_ADMIN explicitly?

Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx> · Tue, 6 Nov 2012 09:30:32 -0600

Quoting Tejun Heo (tj@xxxxxxxxxx):
> Hello, Serge.
> 
> On Tue, Nov 06, 2012 at 09:01:32AM -0600, Serge Hallyn wrote:
> > More practically, lacking user namespaces you can create a full (i.e.
> > ubuntu) container that doesn't have CAP_SYS_ADMIN, but not one without
> > root.  So this allows you to prevent containers from bypassing devices
> > cgroup restrictions set by the parent.  (In reality we are not using
> > that in ubuntu - we grant CAP_SYS_ADMIN and use apparmor to restrict -
> > but other distros do.)
> 
> Do you even mount cgroupfs in containers?  If you just bind-mount
> cgroupfs verbatim in containers, I don't think that's gonna work very
> well.  If not, all this doesn't make any difference for containers.

I don't know whether those who restrict CAP_SYS_ADMIN mount cgroupfs in
their containers.  By default we do not.

(It's not relevant for this discussion as again we use apparmor to deny
writes, but we *do* optionally bind mount cgroups into the containers,
mounting /sys/fs/cgroup/$cgroup/lxc/$containername/$containername.real
on the host to /sys/fs/cgroup/$cgroup in the container for each cgroup.)
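
To make that mapping concrete, here is a small illustrative sketch (not
what lxc actually runs).  It only builds the (host source, container
target) pairs described above; the function name and the example
subsystem list are made up for illustration, and nothing is mounted.

```python
def cgroup_bind_mounts(subsystems, container):
    """Return (host_source, container_target) path pairs.

    Hypothetical helper: it mirrors the layout described in the text,
    i.e. for each cgroup the host directory
    /sys/fs/cgroup/$cgroup/lxc/$containername/$containername.real
    is bind mounted onto /sys/fs/cgroup/$cgroup inside the container.
    """
    mounts = []
    for cg in subsystems:
        source = f"/sys/fs/cgroup/{cg}/lxc/{container}/{container}.real"
        target = f"/sys/fs/cgroup/{cg}"  # path as seen inside the container
        mounts.append((source, target))
    return mounts


if __name__ == "__main__":
    # Example subsystem names and container name are assumptions.
    for src, dst in cgroup_bind_mounts(["devices", "memory"], "c1"):
        print(f"{src} -> {dst}")
```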

> So, you don't really have any actual use case for the explicit CAP_*
> checks, right?

No, especially since we will now have user namespaces.

We will want to be able to strictly enforce hierarchical limits - i.e.
allow uid 100000 (which is uid 0 in the container) to change cgroup
settings, but never to exceed the limits set on the parent directory.
IIUC that is what you are working toward anyway with the general
hierarchy work?  (Thanks for that.)
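
As a toy model of the hierarchical rule I mean (plain Python, not
kernel code, and not a claim about how the devices cgroup is or will
be implemented): a child may change its own settings freely, but only
within what its parent effectively allows.  The class name and the
exact-string matching of whitelist entries are simplifying assumptions;
real device whitelist matching is more involved.

```python
class Group:
    """Toy cgroup node holding a device whitelist (hypothetical model)."""

    def __init__(self, allowed, parent=None):
        self.parent = parent
        self.allowed = set(allowed)

    def effective(self):
        """Entries usable here: own entries limited by every ancestor."""
        if self.parent is None:
            return set(self.allowed)
        return self.allowed & self.parent.effective()

    def allow(self, entry):
        """Add an entry, refusing anything the parent does not permit."""
        if self.parent is not None and entry not in self.parent.effective():
            raise PermissionError(f"{entry!r} exceeds the parent's limits")
        self.allowed.add(entry)


if __name__ == "__main__":
    host = Group({"c 1:3 rwm", "c 1:5 rwm"})
    container = Group({"c 1:3 rwm"}, parent=host)

    container.allow("c 1:5 rwm")        # fine: the parent allows it
    try:
        container.allow("b 8:0 rwm")    # refused: not in the parent
    except PermissionError as err:
        print("denied:", err)

    # Tightening the parent later also tightens the child.
    host.allowed.discard("c 1:5 rwm")
    print(sorted(container.effective()))
```

The point of the sketch is only that the check depends on the parent's
limits, not on which uid or capability set the writer has.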

-serge
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers