Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:

> On Mon, Oct 20, 2014 at 9:49 PM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>
>>> On Sun, Oct 19, 2014 at 9:55 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>>>
>>>>
>>>> On October 19, 2014 1:26:29 PM CDT, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>>>>> Is the idea
>>>>>that you want a privileged user wrt a cgroupns's userns to be able to
>>>>>use this? If so:
>>>>>
>>>>>Yes, that current_cred() thing is bogus. (Actually, this is probably
>>>>>exploitable right now if any cgroup.procs inode anywhere on the system
>>>>>lets non-root write.) (Can we have some kernel debugging option that
>>>>>makes any use of current_cred() in write(2) warn?)
>>>>>
>>>>>We really need a weaker version of may_ptrace for this kind of stuff.
>>>>>Maybe the existing may_ptrace stuff is okay, actually. But this is
>>>>>completely missing group checks, cap checks, capabilities wrt the
>>>>>userns, etc.
>>>>>
>>>>>Also, I think that, if this version of the patchset allows non-init
>>>>>userns to unshare cgroupns, then the issue of what permission is
>>>>>needed to lock the cgroup hierarchy like that needs to be addressed,
>>>>>because unshare(CLONE_NEWUSER|CLONE_NEWCGROUP) will effectively pin
>>>>>the calling task with no permission required. Bolting on a fix later
>>>>>will be a mess.
>>>>
>>>> I imagine the pinning would be like the userns.
>>>>
>>>> Ah but there is a potentially serious issue with the pinning.
>>>> With pinning we can make it impossible for root to move us to a different cgroup.
>>>>
>>>> I am not certain how serious that is but it bears thinking about.
>>>> If we don't implement pinning we should be able to implement everything with just filesystem mount options, and no new namespace required.
>>>>
>>>> Sigh.
>>>>
>>>> I am too tired tonight to see the end game in this.
>>>
>>> Possible solution:
>>>
>>> Ditch the pinning. That is, if you're outside a cgroupns (or you have
>>> a non-ns-confined cgroupfs mounted), then you can move a task in a
>>> cgroupns outside of its root cgroup. If you do this, then the task
>>> thinks its cgroup is something like "../foo" or "../../foo".
>>
>> Of the possible solutions that seems attractive to me, simply because
>> we sometimes want to allow clever things to occur.
>>
>> Does anyone know of a reason (beyond pretty printing) why we need
>> cgroupns to restrict the subset of cgroups processes can be in?
>>
>> I would expect permissions on the cgroup directories themselves, and
>> limited visibility would be (in general) to achieve the desired
>> visibility.
>
> This makes the security impact of cgroupns very easy to understand,
> right? Because there really won't be any -- cgroupns only affects
> reads from /proc and what cgroupfs shows, but it doesn't change any
> actual cgroups, nor does it affect any cgroup *changes*.

It seems like what we have described is chcgrouproot, aka chroot for
cgroups. At which point I think there are potentially similar security
issues as for chroot. Can we confuse a setuid root process if we make
its cgroup names look different?

Of course the confusing-root concern is handled by the usual namespace
security checks that are already present.

I do wonder, if we think of this as chcgrouproot, whether there is a
simpler implementation.

>>> While we're at it, consider making setns for a cgroupns *not* change
>>> the caller's cgroup. Is there any reason it really needs to?
>>
>> setns doesn't but nsenter is going to need to change the cgroup
>> if the pinning requirement is kept. nsenter is going to want to
>> change the cgroup if the pinning requirement is dropped.
>>
>
> It seems easy enough for nsenter to change the cgroup all by itself.

Again. I don't think anyone has suggested or implemented anything
different.
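
To make the "../foo" naming from the no-pinning proposal concrete, here
is a rough userspace sketch in Python. It is not kernel code and not
what any patch in this thread does; the function name and the example
cgroup paths are made up for illustration. It only shows how a task's
cgroup could be displayed relative to its cgroupns root, chroot-style,
including the case where the task has been moved outside that root.

```python
import posixpath


def cgroup_path_as_seen(ns_root: str, task_cgroup: str) -> str:
    """Display task_cgroup relative to ns_root (both absolute paths).

    Hypothetical helper: paths at or below the root are shown as
    absolute paths inside the namespace, paths outside the root are
    shown with leading ".." components.
    """
    rel = posixpath.relpath(task_cgroup, start=ns_root)
    if rel == ".":
        # The task sits exactly at the namespace root.
        return "/"
    if rel == ".." or rel.startswith("../"):
        # The task was moved outside the namespace root.
        return rel
    # The task is somewhere below the namespace root.
    return "/" + rel


if __name__ == "__main__":
    root = "/batch/job1"  # assumed cgroupns root, for illustration only
    for cg in ("/batch/job1", "/batch/job1/worker", "/batch/foo", "/foo"):
        print(cg, "->", cgroup_path_as_seen(root, cg))
```

Nothing here restricts which cgroup a task can be in; only the name the
task sees changes, which is the chroot-like property discussed above.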

Eric