On Tue, Oct 21, 2014 at 02:27:13PM -0700, Andy Lutomirski wrote: > On Tue, Oct 21, 2014 at 2:21 PM, Seth Forshee > <seth.forshee@xxxxxxxxxxxxx> wrote: > > On Wed, Oct 15, 2014 at 07:37:36AM -0700, Andy Lutomirski wrote: > >> On Wed, Oct 15, 2014 at 12:39 AM, Seth Forshee > >> <seth.forshee@xxxxxxxxxxxxx> wrote: > >> > On Tue, Oct 14, 2014 at 02:19:19PM -0700, Andy Lutomirski wrote: > >> >> On Tue, Oct 14, 2014 at 2:13 PM, Eric W. Biederman > >> >> <ebiederm@xxxxxxxxxxxx> wrote: > >> >> > Seth Forshee <seth.forshee@xxxxxxxxxxxxx> writes: > >> >> > > >> >> >> On Tue, Oct 14, 2014 at 01:01:02PM -0700, Eric W. Biederman wrote: > >> >> >>> Michael j Theall <mtheall@xxxxxxxxxx> writes: > >> >> >>> > >> >> >>> > Seth Forshee <seth.forshee@xxxxxxxxxxxxx> wrote on 10/14/2014 09:25:55 AM: > >> >> >>> > > >> >> >>> >> From: Seth Forshee <seth.forshee@xxxxxxxxxxxxx> > >> >> >>> >> To: Miklos Szeredi <miklos@xxxxxxxxxx> > >> >> >>> >> Cc: fuse-devel@xxxxxxxxxxxxxxxxxxxxx, "Serge H. Hallyn" > >> >> >>> >> <serge.hallyn@xxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, Seth > >> >> >>> >> Forshee <seth.forshee@xxxxxxxxxxxxx>, "Eric W. Biederman" > >> >> >>> >> <ebiederm@xxxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx > >> >> >>> >> Date: 10/14/2014 09:27 AM > >> >> >>> >> Subject: [fuse-devel] [PATCH v4 4/5] fuse: Support privileged xattrs > >> >> >>> >> only with a mount option > >> >> >>> >> > >> >> >>> >> Allowing unprivileged users to provide arbitrary xattrs via fuse > >> >> >>> >> mounts bypasses the normal restrictions on setting xattrs. Such > >> >> >>> >> mounts should be restricted to reading and writing xattrs in the > >> >> >>> >> user.* namespace. > >> >> >>> >> > >> >> >>> > > >> >> >>> > Can you explain how the normal restrictions on setting xattrs are > >> >> >>> > bypassed? > >> >> >>> > >> >> >>> If the fuse server is not run by root. Which is a large part of the > >> >> >>> point of fuse. > >> >> >> > >> >> >> So the server could for example return trusted.* xattrs which were not > >> >> >> set by a privileged user. > >> >> >> > >> >> >>> > My filesystem still needs security.* and system.*, and it looks like > >> >> >>> > xattr_permission already prevents non-privileged users from accessing > >> >> >>> > trusted.* > >> >> >>> > >> >> >>> If the filesystem is mounted with nosuid (typical of a non-privileged > >> >> >>> mount of fuse) then the security.* attributes are ignored. > >> >> >> > >> >> >> That I wasn't aware of. In fact I still haven't found where this > >> >> >> restriction is implemented. > >> >> > > >> >> > My memory may be have been incomplete. What I was thinking of is > >> >> > security/commoncap.c the MNT_NOSUID check in get_file_caps. > >> >> > > >> >> > Upon inspection that appears limited to file capabilities, and while > >> >> > there are a few other MNT_NOSUID checks under security the feel far from > >> >> > complete. > >> >> > > >> >> > Sigh. > >> >> > > >> >> > This deserves a hard look because if MNT_NOSUID is not sufficient than > >> >> > it may be possible for me to insert a usb stick with an extN filesystem > >> >> > with the right labels having it auto-mounted nosuid and subvert the > >> >> > security of something like selinux. > >> >> > >> >> It's this code in selinux/hooks.c: > >> >> > >> >> if ((bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) || > >> >> (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) > >> >> new_tsec->sid = old_tsec->sid; > >> >> > >> >> > >> >> One might argue that this should actually generate -EPERM instead of > >> >> ignoring the label, but it should be safe against untrusted media. > >> >> > >> >> > > >> >> >> Nonetheless, a userns mount could be done without nosuid (though that > >> >> >> mount will also be unaccessible outside of that namespace). > >> >> >> > >> >> >>> >> It's difficult though to tell whether a mount is being performed > >> >> >>> >> on behalf of an unprivileged user since fuse mounts are ususally > >> >> >>> >> done via a suid root helper. Thus a new mount option, > >> >> >>> >> privileged_xattrs, is added to indicated that xattrs from other > >> >> >>> >> namespaces are allowed. This option can only be supplied by > >> >> >>> >> system-wide root; supplying the option as an unprivileged user > >> >> >>> >> will cause the mount to fail. > >> >> >>> > > >> >> >>> > I can't say I'm convinced that this is the right direction to head. > >> >> >>> > >> >> >>> With respect to defaults we could keep the current default if you > >> >> >>> have the global CAP_SYS_ADMIN privilege when the mount takes place > >> >> >>> and then avoid breaking anything. > >> >> >> > >> >> >> Except that unprivileged mounts are normally done by a suid root helper, > >> >> >> which is why I've required both global CAP_SYS_ADMIN and a mount option > >> >> >> to get the current default behavior. > >> >> > > >> >> > If nosuid is sufficient that may break existing setups for no good > >> >> > reason. > >> >> > > >> >> > Shrug. I won't have much time for a bit but I figured I would highlight > >> >> > the potential security hole in existing setups. So someone with time > >> >> > this week can look at that. > >> >> > >> >> I think I have a better solution. I'll send it out. > >> > > >> > To be honest I don't understand enough about how selinux uses security > >> > labels to know what level of paranoia is appropriate, so I wrote this > >> > out of an excess of paranoia. If the patch you sent restricts things > >> > sufficiently then I'm perfectly happy to see this patch dropped. > >> > > >> > And really with fuse all of this is really excess paranoia because (if > >> > my other patches are applied at least) the fuse mount will be > >> > inaccessible to any user outside the user namespace from which it was > >> > mounted or its descendants. > >> > > >> > >> I missed the rest of the series. This is exciting! > >> > >> I'm not sure that the other protections you have are quite sufficient, > >> though, without something like my patch. I'll comment on the rest. > > > > I still suspect we should be doing more to limit xattrs from userns > > mounts, since normally only root is allowed to set trusted.* and > > security.* xattrs. Seems like this should be done more generally though > > and not just specific to fuse. Something like this maybe? It probably > > won't matter much for fuse mounts since they won't be accessible outside > > the userns which did the mount, but for other filesystems the xattrs > > could be set externally and injected into the system via a userns mount. > > > > > > diff --git a/fs/super.c b/fs/super.c > > index eae088f6aaae..499cd7d2d2f8 100644 > > --- a/fs/super.c > > +++ b/fs/super.c > > @@ -149,6 +149,7 @@ static void destroy_super(struct super_block *s) > > percpu_counter_destroy(&s->s_writers.counter[i]); > > security_sb_free(s); > > WARN_ON(!list_empty(&s->s_mounts)); > > + put_user_ns(s->s_user_ns); > > kfree(s->s_subtype); > > kfree(s->s_options); > > kfree_rcu(s, rcu); > > @@ -230,6 +231,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags) > > s->s_shrink.count_objects = super_cache_count; > > s->s_shrink.batch = 1024; > > s->s_shrink.flags = SHRINKER_NUMA_AWARE; > > + > > + s->s_user_ns = get_user_ns(&init_user_ns); > > Huh? I think I like this in principle, but shouldn't this be the > actual userns doing the mount? Probably, or else the fs should change it. The reason I'm not sure yet is that I also started poking at adding userns support to ext4 the other day, and for that I'm using s_user_ns to do the translations in i_[ug]id_(read|write) and I still need to verify that it won't break anything for other filesystems that support userns mounts. But you're right; as I've shown it here the changes are ineffective. > > > return s; > > > > fail: > > diff --git a/fs/xattr.c b/fs/xattr.c > > index 64e83efb742d..383bb9f25555 100644 > > --- a/fs/xattr.c > > +++ b/fs/xattr.c > > @@ -40,6 +40,12 @@ xattr_permission(struct inode *inode, const char *name, int mask) > > return -EPERM; > > } > > > > + /* Restrict security.* and trusted.* to mounts from init_user_ns. */ > > + if (inode->i_sb->s_user_ns != &init_user_ns && > > + (!strcmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN) || > > + !strcmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN))) > > + return -EPERM; > > + > > trusted.* should be fine already, I think -- it checks global > capabilities. And I still think that security.* should be left to > LSMs, which IMO really do need to be fixed for user namespaces. > > But how does this help with FUSE at all? Does FUSE end up calling > xattr_permission? It gets called from vfs_getxattr, and thus for the getxattr syscall for all fs types, so this would block reading any trusted.* xattrs from the fuse userspace process. But like I said before, the access restrictions that are in place should prevent this from really being a problem, so these changes could probably wait. The one thing it would change is that if we have s_user_ns in the superblock I'd probably make fuse use that instead of storing it in fs-internal data, but that can always be changed later. Thanks, Seth -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html