Miklos Szeredi <miklos@xxxxxxxxxx> writes:

> On Tue, Sep 23, 2014 at 6:26 PM, Seth Forshee
> <seth.forshee@xxxxxxxxxxxxx> wrote:
>> On Tue, Sep 23, 2014 at 06:07:35PM +0200, Miklos Szeredi wrote:
>>> On Tue, Sep 2, 2014 at 5:44 PM, Seth Forshee <seth.forshee@xxxxxxxxxxxxx> wrote:
>>> > Here's an updated set of patches for allowing fuse mounts from pid and
>>> > user namespaces. I discussed some of the issues we debated with the last
>>> > patch set (and a few others) with Eric at LinuxCon, and the updates here
>>> > mainly reflect the outcome of those discussions.
>>> >
>>> > The stickiest issue in the v1 patches was the question of where to get
>>> > the user and pid namespaces from that are used for translating ids for
>>> > communication with userspace. Eric told me that for user namespaces at
>>> > least we need to grab a namespace at open or mount time and use only
>>> > that namespace to prevent certain types of attacks.
>>>
>>> I'm not convinced. Let us have the gory details, please.
>>
>> I'll do my best, and hopefully Eric will fill in any details I miss.
>>
>> I think there may have been more than one possible scenario that Eric
>> was describing to me, but this is the one I remember. A user could
>> create a namespace and mount a fuse filesystem without nosuid. It could
>> then pass the /dev/fuse fd to a process in init_user_ns, which could
>> expose a suid file owned by root (or any other user) and use it to gain
>> elevated privileges.
>>
>> On the other hand, if file ownership is always interpreted in the
>> context of the namespace from which the filesystem is mounted then suid
>> can only be used to become another uid already under that user's
>> control.
>
> Much simpler solution: don't allow SUID in unprivileged namespaces.
> SUID is a really ugly hack with many problems, just get rid of it.

Except that doesn't solve the problem.

The core problem is: how do we avoid letting a process that implements a
fuse filesystem manipulate a process with privileges that it does not
have?

The classic fuse solution to this is to only allow a single uid to
access the fuse filesystem.

With user namespaces we can relax that restriction. When the filesystem
is mounted with just user namespace root permissions, we can allow it to
use any uids mapped into the mounter's user namespace. The user
namespace root that is mounting the filesystem already has privileges to
manipulate those users.

Which means the simple, straightforward, and understandable restriction
to enforce is:

- The user namespace of the opener of /dev/fuse is the same as the user
  namespace the fuse filesystem is mounted in, which is the same as the
  user namespace we translate any uids in and out of.

By forbidding all uids and gids not mapped into that user namespace from
being used, you cannot manipulate processes you would not have been
able to manipulate before.

As a nice side effect, that makes the setuid bit being set on a file in
a fuse filesystem uninteresting. At that point the setuid bit being set
is no more dangerous than the setuid bit being set on any file you own.

Eric
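
To make the restriction above concrete, here is a rough toy model in
Python. It is not kernel code, and every name in it (UidRange, UserNs,
FuseConn, owner_from_server, the "container" namespace and its
100000-based mapping) is made up for illustration. It only sketches the
idea: one namespace is captured when /dev/fuse is opened, the mount must
come from that same namespace, and every uid the fuse server reports is
translated through that namespace's map or rejected.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UidRange:
    ns_first: int      # first uid as seen inside the namespace
    kernel_first: int  # first uid as seen globally
    count: int         # number of consecutive uids in the range


@dataclass(frozen=True)
class UserNs:
    name: str
    uid_map: Tuple[UidRange, ...]

    def to_kernel(self, ns_uid: int) -> Optional[int]:
        """Translate a namespace-local uid to a global one; None if unmapped."""
        for r in self.uid_map:
            if r.ns_first <= ns_uid < r.ns_first + r.count:
                return r.kernel_first + (ns_uid - r.ns_first)
        return None


class FuseConn:
    """Hypothetical stand-in for a fuse connection."""

    def __init__(self, opener_ns: UserNs):
        # Captured once, when /dev/fuse is opened, and never changed.
        self.user_ns = opener_ns

    def mount(self, mounter_ns: UserNs) -> None:
        # Opener namespace and mount namespace must be the same object.
        if mounter_ns is not self.user_ns:
            raise PermissionError("mount namespace differs from /dev/fuse opener's")

    def owner_from_server(self, ns_uid: int) -> int:
        # Every uid the server reports is interpreted in the captured
        # namespace, no matter which process currently holds the fd.
        kuid = self.user_ns.to_kernel(ns_uid)
        if kuid is None:
            raise ValueError(f"uid {ns_uid} is not mapped in {self.user_ns.name}")
        return kuid


# Assumed example namespaces: the initial one maps everything to itself,
# the unprivileged one maps uids 0..999 onto 100000..100999.
INIT_NS = UserNs("init_user_ns", (UidRange(0, 0, 2**32 - 1),))
CONTAINER_NS = UserNs("container", (UidRange(0, 100000, 1000),))

conn = FuseConn(CONTAINER_NS)
conn.mount(CONTAINER_NS)

# A suid file the server claims is owned by "root" is owned by global
# uid 100000, which the namespace creator already controls.
print(conn.owner_from_server(0))  # 100000
```

In this model, handing the /dev/fuse fd to a process in init_user_ns
gains nothing: the server can still only name uids that the captured
namespace maps, so a "root-owned" suid file never refers to global
uid 0.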