Re: [fuse-devel] [PATCH v4 4/5] fuse: Support privileged xattrs only with a mount option

Seth Forshee <seth.forshee@xxxxxxxxxxxxx> · Tue, 21 Oct 2014 23:58:48 -0500

On Tue, Oct 21, 2014 at 02:27:13PM -0700, Andy Lutomirski wrote:
> On Tue, Oct 21, 2014 at 2:21 PM, Seth Forshee
> <seth.forshee@xxxxxxxxxxxxx> wrote:
> > On Wed, Oct 15, 2014 at 07:37:36AM -0700, Andy Lutomirski wrote:
> >> On Wed, Oct 15, 2014 at 12:39 AM, Seth Forshee
> >> <seth.forshee@xxxxxxxxxxxxx> wrote:
> >> > On Tue, Oct 14, 2014 at 02:19:19PM -0700, Andy Lutomirski wrote:
> >> >> On Tue, Oct 14, 2014 at 2:13 PM, Eric W. Biederman
> >> >> <ebiederm@xxxxxxxxxxxx> wrote:
> >> >> > Seth Forshee <seth.forshee@xxxxxxxxxxxxx> writes:
> >> >> >
> >> >> >> On Tue, Oct 14, 2014 at 01:01:02PM -0700, Eric W. Biederman wrote:
> >> >> >>> Michael j Theall <mtheall@xxxxxxxxxx> writes:
> >> >> >>>
> >> >> >>> > Seth Forshee <seth.forshee@xxxxxxxxxxxxx> wrote on 10/14/2014 09:25:55 AM:
> >> >> >>> >
> >> >> >>> >> From: Seth Forshee <seth.forshee@xxxxxxxxxxxxx>
> >> >> >>> >> To: Miklos Szeredi <miklos@xxxxxxxxxx>
> >> >> >>> >> Cc: fuse-devel@xxxxxxxxxxxxxxxxxxxxx, "Serge H. Hallyn"
> >> >> >>> >> <serge.hallyn@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, Seth
> >> >> >>> >> Forshee <seth.forshee@xxxxxxxxxxxxx>, "Eric W. Biederman"
> >> >> >>> >> <ebiederm@xxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx
> >> >> >>> >> Date: 10/14/2014 09:27 AM
> >> >> >>> >> Subject: [fuse-devel] [PATCH v4 4/5] fuse: Support privileged xattrs
> >> >> >>> >> only with a mount option
> >> >> >>> >>
> >> >> >>> >> Allowing unprivileged users to provide arbitrary xattrs via fuse
> >> >> >>> >> mounts bypasses the normal restrictions on setting xattrs. Such
> >> >> >>> >> mounts should be restricted to reading and writing xattrs in the
> >> >> >>> >> user.* namespace.
> >> >> >>> >>
> >> >> >>> >
> >> >> >>> > Can you explain how the normal restrictions on setting xattrs are
> >> >> >>> > bypassed?
> >> >> >>>
> >> >> >>> If the fuse server is not run by root.  Which is a large part of the
> >> >> >>> point of fuse.
> >> >> >>
> >> >> >> So the server could for example return trusted.* xattrs which were not
> >> >> >> set by a privileged user.
> >> >> >>
> >> >> >>> > My filesystem still needs security.* and system.*, and it looks like
> >> >> >>> > xattr_permission already prevents non-privileged users from accessing
> >> >> >>> > trusted.*
> >> >> >>>
> >> >> >>> If the filesystem is mounted with nosuid (typical of a non-privileged
> >> >> >>> mount of fuse) then the security.* attributes are ignored.
> >> >> >>
> >> >> >> That I wasn't aware of. In fact I still haven't found where this
> >> >> >> restriction is implemented.
> >> >> >
> >> >> > My memory may be have been incomplete.  What I was thinking of is
> >> >> > security/commoncap.c the MNT_NOSUID check in get_file_caps.
> >> >> >
> >> >> > Upon inspection that appears limited to file capabilities, and while
> >> >> > there are a few other MNT_NOSUID checks under security the feel far from
> >> >> > complete.
> >> >> >
> >> >> > Sigh.
> >> >> >
> >> >> > This deserves a hard look because if MNT_NOSUID is not sufficient than
> >> >> > it may be possible for me to insert a usb stick with an extN filesystem
> >> >> > with the right labels having it auto-mounted nosuid and subvert the
> >> >> > security of something like selinux.
> >> >>
> >> >> It's this code in selinux/hooks.c:
> >> >>
> >> >>     if ((bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) ||
> >> >>         (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS))
> >> >>         new_tsec->sid = old_tsec->sid;
> >> >>
> >> >>
> >> >> One might argue that this should actually generate -EPERM instead of
> >> >> ignoring the label, but it should be safe against untrusted media.
> >> >>
> >> >> >
> >> >> >> Nonetheless, a userns mount could be done without nosuid (though that
> >> >> >> mount will also be unaccessible outside of that namespace).
> >> >> >>
> >> >> >>> >> It's difficult though to tell whether a mount is being performed
> >> >> >>> >> on behalf of an unprivileged user since fuse mounts are ususally
> >> >> >>> >> done via a suid root helper. Thus a new mount option,
> >> >> >>> >> privileged_xattrs, is added to indicated that xattrs from other
> >> >> >>> >> namespaces are allowed. This option can only be supplied by
> >> >> >>> >> system-wide root; supplying the option as an unprivileged user
> >> >> >>> >> will cause the mount to fail.
> >> >> >>> >
> >> >> >>> > I can't say I'm convinced that this is the right direction to head.
> >> >> >>>
> >> >> >>> With respect to defaults we could keep the current default if you
> >> >> >>> have the global CAP_SYS_ADMIN privilege when the mount takes place
> >> >> >>> and then avoid breaking anything.
> >> >> >>
> >> >> >> Except that unprivileged mounts are normally done by a suid root helper,
> >> >> >> which is why I've required both global CAP_SYS_ADMIN and a mount option
> >> >> >> to get the current default behavior.
> >> >> >
> >> >> > If nosuid is sufficient that may break existing setups for no good
> >> >> > reason.
> >> >> >
> >> >> > Shrug.  I won't have much time for a bit but I figured I would highlight
> >> >> > the potential security hole in existing setups.  So someone with time
> >> >> > this week can look at that.
> >> >>
> >> >> I think I have a better solution.  I'll send it out.
> >> >
> >> > To be honest I don't understand enough about how selinux uses security
> >> > labels to know what level of paranoia is appropriate, so I wrote this
> >> > out of an excess of paranoia. If the patch you sent restricts things
> >> > sufficiently then I'm perfectly happy to see this patch dropped.
> >> >
> >> > And really with fuse all of this is really excess paranoia because (if
> >> > my other patches are applied at least) the fuse mount will be
> >> > inaccessible to any user outside the user namespace from which it was
> >> > mounted or its descendants.
> >> >
> >>
> >> I missed the rest of the series.  This is exciting!
> >>
> >> I'm not sure that the other protections you have are quite sufficient,
> >> though, without something like my patch.  I'll comment on the rest.
> >
> > I still suspect we should be doing more to limit xattrs from userns
> > mounts, since normally only root is allowed to set trusted.* and
> > security.* xattrs. Seems like this should be done more generally though
> > and not just specific to fuse. Something like this maybe? It probably
> > won't matter much for fuse mounts since they won't be accessible outside
> > the userns which did the mount, but for other filesystems the xattrs
> > could be set externally and injected into the system via a userns mount.
> >
> >
> > diff --git a/fs/super.c b/fs/super.c
> > index eae088f6aaae..499cd7d2d2f8 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -149,6 +149,7 @@ static void destroy_super(struct super_block *s)
> >                 percpu_counter_destroy(&s->s_writers.counter[i]);
> >         security_sb_free(s);
> >         WARN_ON(!list_empty(&s->s_mounts));
> > +       put_user_ns(s->s_user_ns);
> >         kfree(s->s_subtype);
> >         kfree(s->s_options);
> >         kfree_rcu(s, rcu);
> > @@ -230,6 +231,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
> >         s->s_shrink.count_objects = super_cache_count;
> >         s->s_shrink.batch = 1024;
> >         s->s_shrink.flags = SHRINKER_NUMA_AWARE;
> > +
> > +       s->s_user_ns = get_user_ns(&init_user_ns);
> 
> Huh?  I think I like this in principle, but shouldn't this be the
> actual userns doing the mount?

Probably, or else the fs should change it. The reason I'm not sure yet
is that I also started poking at adding userns support to ext4 the other
day, and for that I'm using s_user_ns to do the translations in
i_[ug]id_(read|write) and I still need to verify that it won't break
anything for other filesystems that support userns mounts. But you're
right; as I've shown it here the changes are ineffective.

> 
> >         return s;
> >
> >  fail:
> > diff --git a/fs/xattr.c b/fs/xattr.c
> > index 64e83efb742d..383bb9f25555 100644
> > --- a/fs/xattr.c
> > +++ b/fs/xattr.c
> > @@ -40,6 +40,12 @@ xattr_permission(struct inode *inode, const char *name, int mask)
> >                         return -EPERM;
> >         }
> >
> > +       /* Restrict security.* and trusted.* to mounts from init_user_ns. */
> > +       if (inode->i_sb->s_user_ns != &init_user_ns &&
> > +           (!strcmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN) ||
> > +            !strcmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN)))
> > +               return -EPERM;
> > +
> 
> trusted.* should be fine already, I think -- it checks global
> capabilities.  And I still think that security.* should be left to
> LSMs, which IMO really do need to be fixed for user namespaces.
> 
> But how does this help with FUSE at all?   Does FUSE end up calling
> xattr_permission?

It gets called from vfs_getxattr, and thus for the getxattr syscall for
all fs types, so this would block reading any trusted.* xattrs from the
fuse userspace process.

But like I said before, the access restrictions that are in place should
prevent this from really being a problem, so these changes could
probably wait. The one thing it would change is that if we have
s_user_ns in the superblock I'd probably make fuse use that instead of
storing it in fs-internal data, but that can always be changed later.

Thanks,
Seth
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html