On Tue, Nov 18, 2014 at 9:13 AM, Seth Forshee <seth.forshee@xxxxxxxxxxxxx> wrote:
> On Tue, Nov 18, 2014 at 09:09:34AM -0800, Andy Lutomirski wrote:
>> On Tue, Nov 18, 2014 at 7:21 AM, Seth Forshee
>> <seth.forshee@xxxxxxxxxxxxx> wrote:
>> > On Wed, Nov 12, 2014 at 10:22:54AM -0600, Seth Forshee wrote:
>> >> On Wed, Nov 12, 2014 at 02:09:15PM +0100, Miklos Szeredi wrote:
>> >> > On Tue, Nov 11, 2014 at 09:37:10AM -0600, Eric W. Biederman wrote:
>> >> >
>> >> > > > Maybe I'm being dense, but can someone give a concrete example of such an
>> >> > > > attack?
>> >> > >
>> >> > > There are two variants of things at play here.
>> >> > >
>> >> > > There is the classic: if you don't freeze your context at open time, then
>> >> > > when you pass that file descriptor to another process unexpected things
>> >> > > can happen.
>> >> > >
>> >> > > An essentially harmless but extremely confusing example is what happens
>> >> > > to a partial read when it stops halfway through a uid value and the next
>> >> > > read on the same file descriptor is from a process in a different user
>> >> > > namespace. Which uid value should be returned to userspace?
>> >> >
>> >> > Fuse device doesn't currently do partial reads, so that's a non-issue.
>> >> >
>> >> > > Now if I am in a nefarious mood I can create an unprivileged user
>> >> > > namespace, open /dev/fuse and mount a fuse filesystem. Pass the file
>> >> > > descriptor to /dev/fuse to a process that is in the default user
>> >> > > namespace (and thus can use any uid/gid). With that file descriptor
>> >> > > report that there is a setuid 0 executable on that file system.
>> >> >
>> >> > Yes, and this would also be prevented by MNT_NOSUID, which would be a good idea
>> >> > anyway. I just don't see the reason we'd want to allow clearing MNT_NOSUID in a
>> >> > private namespace.
>> >> >
>> >> > So we don't currently see a use case for relaxing either the MNT_NOSUID
>> >> > restriction or for relaxing the requirement on the user namespace the fuse
>> >> > server is in. Is that correct?
>> >> >
>> >> > If so, we should leave both restrictions in place since that allows the greatest
>> >> > flexibility in the future, if either of those needs to be relaxed.
>> >>
>> >> I'm not aware of specific use cases for either at this point. However,
>> >> Andy's patch [1] will limit suid to the set of namespaces where the user
>> >> who mounted the filesystem already has privileges. Enforcing MNT_NOSUID
>> >> will require enforcement in the vfs, and in that case we definitely need
>> >> to decide whether the policy is to implicitly add the flag or fail the
>> >> mount attempt if the flag is not present [2].
>> >
>> > I asked around a bit, and it turns out there are use cases for nested
>> > containers (i.e. a container within a container) where the rootfs for
>> > the outer container mounts a filesystem containing the rootfs for the
>> > inner container. If that mount is nosuid then suid utilities like ping
>> > aren't going to work in the inner container.
>> >
>> > So since there's a use case for suid in a userns mount and we have what
>> > we believe are sufficient protections against using this as a vector to
>> > get privileges outside the container, I'm planning to move ahead without
>> > the MNT_NOSUID restriction. Any objections?
>>
>> Are you talking about MNT_NOSUID the flag or my ns-dependent thing?
>
> I'm talking about dropping the proposed requirement from Miklos that all
> fuse userns mounts are required to have the MNT_NOSUID flag. I intend to
> keep your ns-dependent thing.
>

In that case, I agree completely. There are certainly uses for non-nosuid
mounts in containers, and I don't see why fuse should be any different.

--Andy

> Thanks,
> Seth
>

-- 
Andy Lutomirski
AMA Capital Management, LLC