Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> Tejun Heo <tj@xxxxxxxxxx> writes:
>
> > Hello, guys.
> >
> > Why doesn't it follow the usual security enforced by cgroupfs
> > permissions?  Why is the explicit check necessary?
>
> An almost more interesting question is why is cgroup one of the last
> pieces of code not using capabilities and instead lets you attach to any
> process simply if your uid == 0.
>
> I don't know the history but the device cgroup testing for CAP_SYS_ADMIN
> makes a naive sort of sense to me.

Right, it's possible to run a daemon with uid 0 but restricted
capabilities (enforced by the bounding set) for all its children, but it
is not possible to run a non-uid-0 daemon that keeps some capabilities
across execs.  (Of course that can be worked around with privileged
helpers or, to an extent, inheritable capabilities.)  Capabilities make
sense for the cgroup controls.

In other words, any code which equates uid 0 with possession of a
capability takes away from the usefulness of capabilities everywhere.

More practically, lacking user namespaces you can create a full (e.g.
Ubuntu) container that doesn't have CAP_SYS_ADMIN, but not one without
root.  So this allows you to prevent containers from bypassing devices
cgroup restrictions set by the parent.  (In reality we are not using
that in Ubuntu - we grant CAP_SYS_ADMIN and use AppArmor to restrict -
but other distros do.)

-serge
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
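
To make the uid-versus-capability distinction above concrete, here is a minimal userspace sketch. It is not the kernel code under discussion. It reads the effective capability mask from the CapEff line of /proc/<pid>/status and compares two policies: "allowed if uid == 0" and "allowed if CAP_SYS_ADMIN is held". The function names are hypothetical and chosen only for illustration; the one fixed fact assumed is that CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>.

```python
import os

CAP_SYS_ADMIN = 21  # capability number from <linux/capability.h>


def parse_cap_eff(status_text):
    """Return the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap(effective_mask, cap):
    """True if bit number `cap` is set in the effective mask."""
    return bool((effective_mask >> cap) & 1)


def allowed_by_uid(euid):
    """Hypothetical policy: anything running as uid 0 is allowed."""
    return euid == 0


def allowed_by_capability(effective_mask):
    """Hypothetical policy: only holders of CAP_SYS_ADMIN are allowed."""
    return has_cap(effective_mask, CAP_SYS_ADMIN)


if __name__ == "__main__":
    # Linux only: inspect the current process.
    with open("/proc/self/status") as f:
        mask = parse_cap_eff(f.read())
    print("uid check:       ", allowed_by_uid(os.geteuid()))
    print("capability check:", allowed_by_capability(mask))
```

A root process whose CAP_SYS_ADMIN bit has been dropped passes the first policy and fails the second, which is the case of a container that runs as root without CAP_SYS_ADMIN.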