On Fri, May 1, 2020 at 3:34 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > On Fri, May 01, 2020 at 05:14:44AM +0100, Al Viro wrote: > > On Thu, Apr 09, 2020 at 11:28:59PM +0200, Miklos Szeredi wrote: > > > From: Miklos Szeredi <mszeredi@xxxxxxxxxx> > > > > > > Whiteouts, unlike real device node should not require privileges to create. > > > > > > The general concern with device nodes is that opening them can have side > > > effects. The kernel already avoids zero major (see > > > Documentation/admin-guide/devices.txt). To be on the safe side the patch > > > explicitly forbids registering a char device with 0/0 number (see > > > cdev_add()). > > > > > > This guarantees that a non-O_PATH open on a whiteout will fail with ENODEV; > > > i.e. it won't have any side effect. > > > > > int vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev) > > > { > > > + bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV; > > > int error = may_create(dir, dentry); > > > > > > if (error) > > > return error; > > > > > > - if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD)) > > > + if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD) && > > > + !is_whiteout) > > > return -EPERM; > > > > Hmm... That exposes vfs_whiteout() to LSM; are you sure that you won't > > end up with regressions for overlayfs on sufficiently weird setups? > > You're right. OTOH, what can we do? We can't fix the weird setups, only the > distros/admins can. > > Can we just try this, and revert to calling ->mknod directly from overlayfs if > it turns out to be a problem that people can't fix easily? > > I guess we could add a new ->whiteout security hook as well, but I'm not sure > it's worth it. Cc: LMS mailing list; patch re-added for context. I feel like I am still missing context but IIUC this change is allowing unprivileged userspace to explicitly call mknod(2) with the whiteout device number and skip all permission checks (except the LSM one). And then you are switching vfs_whiteout() over to using vfs_mknod() internally since it no longer does permission checking and that was why vfs_whiteout() was separate originally to avoid imposing any checks on overlayfs-internal creation of whiteouts? If that's correct, then it seems problematic since we have no way in the LSM hook to distinguish the two cases (userspace invocation of mknod(2) versus overlayfs-internal operation). Don't know offhand what credential is in effect in the overlayfs case (mounter or current) but regardless Android seems to use current regardless, and that could easily fail.