On Tue, Apr 29, 2014 at 04:51:54PM -0700, Andy Lutomirski wrote: > On Tue, Apr 29, 2014 at 4:47 PM, Stéphane Graber <stgraber@xxxxxxxxxx> wrote: > > On Tue, Apr 29, 2014 at 04:22:55PM -0700, Andy Lutomirski wrote: > >> On Tue, Apr 29, 2014 at 4:20 PM, Marian Marinov <mm@xxxxxx> wrote: > >> > On 04/30/2014 01:45 AM, Andy Lutomirski wrote: > >> >> > >> >> On 04/29/2014 03:29 PM, Serge Hallyn wrote: > >> >>> > >> >>> Quoting Marian Marinov (mm-108MBtLGafw@xxxxxxxxxxxxxxxx): > >> >>>> > >> >>>> On 04/30/2014 01:02 AM, Serge Hallyn wrote: > >> >>>>> > >> >>>>> Quoting Marian Marinov (mm-108MBtLGafw@xxxxxxxxxxxxxxxx): > >> >>>>>> > >> >>>>>> On 04/29/2014 09:52 PM, Serge Hallyn wrote: > >> >>>>>>> > >> >>>>>>> Quoting Theodore Ts'o (tytso-3s7WtUTddSA@xxxxxxxxxxxxxxxx): > >> >>>>>>>> > >> >>>>>>>> On Tue, Apr 29, 2014 at 04:49:14PM +0300, Marian Marinov wrote: > >> >>>>>>>>> > >> >>>>>>>>> > >> >>>>>>>>> I'm proposing a fix to this, by replacing the > >> >>>>>>>>> capable(CAP_LINUX_IMMUTABLE) > >> >>>>>>>>> check with ns_capable(current_cred()->user_ns, > >> >>>>>>>>> CAP_LINUX_IMMUTABLE). > >> >>>>>>>> > >> >>>>>>>> > >> >>>>>>>> Um, wouldn't it be better to simply fix the capable() function? > >> >>>>>>>> > >> >>>>>>>> /** > >> >>>>>>>> * capable - Determine if the current task has a superior > >> >>>>>>>> capability in effect > >> >>>>>>>> * @cap: The capability to be tested for > >> >>>>>>>> * > >> >>>>>>>> * Return true if the current task has the given superior > >> >>>>>>>> capability currently > >> >>>>>>>> * available for use, false if not. > >> >>>>>>>> * > >> >>>>>>>> * This sets PF_SUPERPRIV on the task if the capability is > >> >>>>>>>> available on the > >> >>>>>>>> * assumption that it's about to be used. > >> >>>>>>>> */ > >> >>>>>>>> bool capable(int cap) > >> >>>>>>>> { > >> >>>>>>>> return ns_capable(&init_user_ns, cap); > >> >>>>>>>> } > >> >>>>>>>> EXPORT_SYMBOL(capable); > >> >>>>>>>> > >> >>>>>>>> The documentation states that it is for "the current task", and I > >> >>>>>>>> can't imagine any use case, where user namespaces are in effect, > >> >>>>>>>> where > >> >>>>>>>> using init_user_ns would ever make sense. > >> >>>>>>> > >> >>>>>>> > >> >>>>>>> the init_user_ns represents the user_ns owning the object, not the > >> >>>>>>> subject. > >> >>>>>>> > >> >>>>>>> The patch by Marian is wrong. Anyone can do 'clone(CLONE_NEWUSER)', > >> >>>>>>> setuid(0), execve, and end up satisfying > >> >>>>>>> 'ns_capable(current_cred()->userns, > >> >>>>>>> CAP_SYS_IMMUTABLE)' by definition. > >> >>>>>>> > >> >>>>>>> So NACK to that particular patch. I'm not sure, but IIUC it should > >> >>>>>>> be > >> >>>>>>> safe to check against the userns owning the inode? > >> >>>>>>> > >> >>>>>> > >> >>>>>> So what you are proposing is to replace > >> >>>>>> 'ns_capable(current_cred()->userns, CAP_SYS_IMMUTABLE)' with > >> >>>>>> 'inode_capable(inode, CAP_SYS_IMMUTABLE)' ? > >> >>>>>> > >> >>>>>> I agree that this is more sane. > >> >>>>> > >> >>>>> > >> >>>>> Right, and I think the two operations you're looking at seem sane > >> >>>>> to allow. > >> >>>> > >> >>>> > >> >>>> If you are ok with this patch, I will fix all file systems and send > >> >>>> patches. > >> >>> > >> >>> > >> >>> Sounds good, thanks. > >> >>> > >> >>>> Signed-off-by: Marian Marinov <mm-NV7Lj0SOnH0@xxxxxxxxxxxxxxxx> > >> >>> > >> >>> > >> >>> Acked-by: Serge E. Hallyn > >> >>> <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@xxxxxxxxxxxxxxxx> > >> >> > >> >> > >> >> Wait, what? > >> >> > >> >> Inodes aren't owned by user namespaces; they're owned by users. And any > >> >> user can arrange to have a user namespace in which they pass an > >> >> inode_capable check on any inode that they own. > >> >> > >> >> Presumably there's a reason that CAP_SYS_IMMUTABLE is needed. If this > >> >> gets merged, then it would be better to just drop CAP_SYS_IMMUTABLE > >> >> entirely. > >> > > >> > > >> > The problem I'm trying to solve is this: > >> > > >> > container with its own user namespace and CAP_SYS_IMMUTABLE should be able > >> > to use chattr on all files witch this container has access to. > >> > > >> > Unfortunately with the capable(CAP_SYS_IMMUTABLE) check this is not working. > >> > > >> > With the proposed two fixes CAP_SYS_IMMUTABLE started working in the > >> > container. > >> > > >> > The first solution got its user namespace from the currently running process > >> > and the second gets its user namespace from the currently opened inode. > >> > > >> > So what would be the best solution in this case? > >> > >> I'd suggest adding a mount option like fs_owner_uid that names a uid > >> that owns, in the sense of having unlimited access to, a filesystem. > >> Then anyone with caps on a namespace owned by that uid could do > >> whatever. > >> > >> Eric? > >> > >> --Andy > > > > The most obvious problem I can think of with "do whatever" is that this > > will likely include mknod of char and block devices which you can then > > chown/chmod as you wish and use to access any devices on the system from > > an unprivileged container. > > This can however be mitigated by using the devices cgroup controller. > > Or 'nodev'. setuid/setgid may have the same problem, too. > > Implementing something like this would also make CAP_DAC_READ_SEARCH > and CAP_DAC_OVERRIDE work. > > Arguably it should be impossible to mount such a thing in the first > place without global privilege. > > > > > You also probably wouldn't want any unprivileged user from the host to > > find a way to access that mounted filesytem but so long as you do the > > mount in a separate mountns and don't share uids between the host and > > the container, that should be fine too. > > This part should be a nonissue -- an unprivileged user who has the > right uid owns the namespace anyway, so this is the least of your > worries. > > --Andy It should be a nonissue so long as we make sure that a file owned by a uid outside the scope of the container may not be changed even though fs_owner_uid is set. Otherwise, it's just a matter of chmod +S on say a shell and anyone who can see the fs from the host will be getting a root shell (assuming said file is owned by the host's uid 0). So that's restricting slightly what "do whatever" would do in this case. -- Stéphane Graber Ubuntu developer http://www.ubuntu.com
Attachment:
signature.asc
Description: Digital signature
_______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers