Re: [PATCH v5 2/4] fuse: Support fuse filesystems outside of init_user_ns

Seth Forshee <seth.forshee@xxxxxxxxxxxxx> · Fri, 21 Nov 2014 12:27:31 -0600

On Fri, Nov 21, 2014 at 12:14:19PM -0600, Eric W. Biederman wrote:
> Seth Forshee <seth.forshee@xxxxxxxxxxxxx> writes:
> 
> > On Wed, Nov 19, 2014 at 03:09:11PM +0100, Serge E. Hallyn wrote:
> >> Quoting Miklos Szeredi (miklos@xxxxxxxxxx):
> >> > On Wed, Nov 19, 2014 at 9:50 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >> > > On Tue, Nov 18, 2014 at 4:21 PM, Seth Forshee
> >> > > <seth.forshee@xxxxxxxxxxxxx> wrote:
> >> > >>> I asked around a bit, and it turns out there are use cases for nested
> >> > >> containers (i.e. a container within a container) where the rootfs for
> >> > >> the outer container mounts a filesystem containing the rootfs for the
> >> > >> inner container. If that mount is nosuid then suid utilities like ping
> >> > >> aren't going to work in the inner container.
> >> > >>
> >> > >> So since there's a use case for suid in a userns mount and we have what
> >> > >> we belive are sufficient protections against using this as a vector to
> >> > >> get privileges outside the container, I'm planning to move ahead without
> >> > >> the MNT_NOSUID restriction. Any objections?
> >> > >
> >> > > In the general case how'd we prevent suid executable being tricked to
> >> > > do something it shouldn't do by unprivileged mounting into sensitive
> >> > > places (i.e. config files) inside the container?
> >> 
> >> The design of the namespaces would prevent that.  You cannot manipulate your
> >> mounts namespace unless you own it.  You cannot manipulate the mounts namespace
> >> for a task whose user namespace you do not own.  If you can, for instance,
> >> bind mount $HOME/shadow onto /etc/shadow, then you already own your user
> >> namespace and are root there, so any suid-root program which you mount through
> >> fuse will only subjegate your own namespace.  Any task which running in the
> >> parent user-ns (and therefore parent mount-ns) will not see your bind mount.
> >> 
> >> > > Allowing SUID looks like a slippery slope to me.  And there are plenty
> >> > > of solutions to the "ping" problem, AFAICS, that don't involve the
> >> > > suid bit.
> >> > 
> >> > ping isn't even suid on my system, it has security.capability xattr instead.
> >> 
> >> security.capability xattrs that will have the exact same concerns wrt
> >> confusion through bind mounts as suid.
> >> 
> >> > Please just get rid of SUID/SGID.  It's a legacy, it's a hack, not
> >> > worth the complexity and potential problems arising from that
> >> > complexity.
> >> 
> >> Oh boy, I don't know which side to sit on here :)  I'm all for replacing
> >> suid with some use of file capabilities, but realistically there are reasons
> >> why that hasn't happened more widely than it has - tar, package managers,
> >> cpio, nfs, etc.
> >
> > Miklos: I we're all generally in agreement here that suid/sgid is not
> > the best solution, but as Serge points out we are unfortunately not yet
> > in a place where it can be completely dropped in favor of capabilities.
> > In light of this can I convince you to reconsider your position?
> 
> Regardless of what fuse does user namespaces must support mounting
> filesystems that have the setuid and setgid bits set.  Likewise we need
> to handle capabilities.
> 
> There is a parallel bit of work to the fuse patches that I think at this
> point should be completed first.
> 
> - Add s_user_ns to struct super.  So we can have filesystems whose
>   labels are not interpreted at a global scope.
> 
> - Tweak the file capability code to look at s_user_ns and treat it
>   properly.
> 
> - Tweak the lsms to look at s_user_ns and ignore security labels that
>   don't come from init_user_ns.  (The lsms at their discrection can
>   be more trusting but the default should be for them to ignore those
>   labels).
> 
> - Tweak the security checks to allow setting file capabilities and
>   other security xattrs if we have the appropriate capabilities in
>   s_user_ns.
> 
> - Update tmpfs and ramfs to set s_user_ns when being mounted.

Okay. Is someone already working on these items? If not I'll work up
some patches.

> When those bits are done we can tweak the fuse patches to also set
> s_user_ns.
> 
> As for MNT_NO_SUID if fuse wants to enforce that in some way.  I don't
> particularly care, but I don't think that makes sense as a vfs property.

I don't think fuse can enforce this since the flag applies to the mount
and not the super block. It would have to be enforced by the vfs.

Thanks,
Seth
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html