On Sat, Oct 18, 2014 at 10:23 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > "Serge E. Hallyn" <serge@xxxxxxxxxx> writes: > >> Quoting Aditya Kali (adityakali@xxxxxxxxxx): >>> On Thu, Oct 16, 2014 at 2:12 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote: >>> > Quoting Aditya Kali (adityakali@xxxxxxxxxx): >>> >> setns on a cgroup namespace is allowed only if >>> >> * task has CAP_SYS_ADMIN in its current user-namespace and >>> >> over the user-namespace associated with target cgroupns. >>> >> * task's current cgroup is descendent of the target cgroupns-root >>> >> cgroup. >>> > >>> > What is the point of this? >>> > >>> > If I'm a user logged into >>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope and I start >>> > a container which is in >>> > /lxc/c1/user.slice/user-1000.slice/session-c12.scope/x1 >>> > then I will want to be able to enter the container's cgroup. >>> > The container's cgroup root is under my own (satisfying the >>> > below condition0 but my cgroup is not a descendent of the >>> > container's cgroup. >>> > >>> This condition is there because we don't want to do implicit cgroup >>> changes when a process attaches to another cgroupns. cgroupns tries to >>> preserve the invariant that at any point, your current cgroup is >>> always under the cgroupns-root of your cgroup namespace. But in your >>> example, if we allow a process in "session-c12.scope" container to >>> attach to cgroupns root'ed at "session-c12.scope/x1" container >>> (without implicitly moving its cgroup), then this invariant won't >>> hold. >> >> Oh, I see. Guess that should be workable. Thanks. > > Which has me looking at what the rules are for moving through > the cgroup hierarchy. > > As long as we have write access to cgroup.procs and are allowed > to open the file for write, we can move any of our own tasks > into the cgroup. So the cgroup namespace rules don't seem > to be a problem. > > Andy can you please take a look at the permission checks in > __cgroup_procs_write. The actual requirements for calling that function haven't changed, right? IOW, what does this have to do with cgroupns? Is the idea that you want a privileged user wrt a cgroupns's userns to be able to use this? If so: Yes, that current_cred() thing is bogus. (Actually, this is probably exploitable right now if any cgroup.procs inode anywhere on the system lets non-root write.) (Can we have some kernel debugging option that makes any use of current_cred() in write(2) warn?) We really need a weaker version of may_ptrace for this kind of stuff. Maybe the existing may_ptrace stuff is okay, actually. But this is completely missing group checks, cap checks, capabilities wrt the userns, etc. Also, I think that, if this version of the patchset allows non-init userns to unshare cgroupns, then the issue of what permission is needed to lock the cgroup hierarchy like that needs to be addressed, because unshare(CLONE_NEWUSER|CLONE_NEWCGROUP) will effectively pin the calling task with no permission required. Bolting on a fix later will be a mess. --Andy > > As I read the code I see 3 security gaffaws in the permssion check. > - Using current->cred instead of file->f_cred. > - Not checking tcred->euid. > - Checking GLOBAL_ROOT_UID instead of having a capable call. > > The file permission on cgroup.procs seem just sufficient to keep > to keep those bugs from being easily exploitable. > > Eric -- Andy Lutomirski AMA Capital Management, LLC _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers