Re: Why does devices cgroup check for CAP_SYS_ADMIN explicitly?

Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx> · Tue, 6 Nov 2012 09:01:32 -0600

Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> Tejun Heo <tj@xxxxxxxxxx> writes:
> 
> > Hello, guys.
> >
> > Why doesn't it follow the usual security enforced by cgroupfs
> > permissions?  Why is the explicit check necessary?
> 
> An almost more interesting question is why is cgroup one of the last
> pieces of code not using capabilities and instead lets you attach to any
> process simply if your uid == 0.
> 
> I don't know the history but the device cgroup testing for CAP_SYS_ADMIN
> makes a naive sort of sense to me.

Right, it's possible to run a daemon with uid 0 but restricted
capabilities (enforced by the bounding set) for all its children, but
not possible to run a non-uid-0 daemon that keeps some capabilities
across execs.  (Of course that can be worked around with privileged
helpers or, to an extent, inheritable capabilities.)  So capabilities
make sense for the cgroup controls.
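
To make the bounding-set point concrete, here is a minimal sketch of how
a uid-0 daemon could drop CAP_SYS_ADMIN for itself and everything it
later forks or execs.  It assumes Linux and calls prctl(2) through
ctypes; the helper names (_prctl, bounding_set_has,
drop_from_bounding_set) are made up for illustration, and this is not
meant to show how any particular container tool does it.

```python
import ctypes
import os

# Constants from <linux/prctl.h> and <linux/capability.h>.
PR_CAPBSET_READ = 23
PR_CAPBSET_DROP = 24
CAP_SYS_ADMIN = 21


def _prctl(option, arg):
    """Hypothetical thin wrapper around prctl(2); raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    res = libc.prctl(
        ctypes.c_int(option),
        ctypes.c_ulong(arg),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res


def bounding_set_has(cap):
    """Return True if `cap` is still in this thread's bounding set."""
    return _prctl(PR_CAPBSET_READ, cap) == 1


def drop_from_bounding_set(cap):
    """Remove `cap` from the bounding set.

    This cannot be undone, is inherited by children, and itself
    requires CAP_SETPCAP, so an unprivileged caller gets EPERM.
    """
    _prctl(PR_CAPBSET_DROP, cap)
```

A daemon would call drop_from_bounding_set(CAP_SYS_ADMIN) before
starting its children.  Dropping from the bounding set alone does not
clear a capability the caller already holds, but children can no longer
gain it through exec, even though their uid is still 0.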

In other words, any code which equates uid 0 with possession of a
capability is taking away from the usefulness of capabilities
everywhere.
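
As a userspace illustration of the difference between the two kinds of
check (a sketch only, not the kernel code under discussion): the
effective capability mask can be read from the CapEff line of
/proc/self/status and tested for the CAP_SYS_ADMIN bit, instead of
asking whether the euid is 0.  The function names here are hypothetical.

```python
import os

CAP_SYS_ADMIN = 21  # bit number from <linux/capability.h>


def parse_cap_mask(status_text, field="CapEff"):
    """Extract a capability mask (hex) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name == field:
            return int(value.strip(), 16)
    raise KeyError(field)


def has_capability(mask, cap):
    """True if bit number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def is_uid_zero():
    """The check being argued against: identity instead of capability."""
    return os.geteuid() == 0


def has_sys_admin(status_path="/proc/self/status"):
    """Capability-based check: is CAP_SYS_ADMIN in the effective set?"""
    with open(status_path) as f:
        return has_capability(parse_cap_mask(f.read()), CAP_SYS_ADMIN)
```

For a uid-0 process whose CAP_SYS_ADMIN has been removed, is_uid_zero()
still returns True while has_sys_admin() returns False, which is exactly
the case a uid == 0 test gets wrong.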

More practically, lacking user namespaces you can create a full (e.g.
Ubuntu) container that doesn't have CAP_SYS_ADMIN, but not one without
root.  So the explicit CAP_SYS_ADMIN check allows you to prevent
containers from bypassing devices cgroup restrictions set by the
parent.  (In reality we are not using that in Ubuntu - we grant
CAP_SYS_ADMIN and use AppArmor to restrict - but other distros do.)

-serge