On Mon, Feb 13, 2017 at 08:07:16PM +0200, Amir Goldstein wrote: > On Mon, Feb 13, 2017 at 6:18 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote: > > Right now bprm_fill_uid() uses inode fetched from file_inode(bprm->file). > > This in turn returns inode of lower filesystem (in a stacked filesystem > > setup). > > > > I was playing with modified patches of shiftfs posted by james bottomley > > and realized that through shiftfs setuid bit does not take effect. And > > reason being that we fetch uid/gid from inode of lower fs (and not from > > shiftfs inode). And that results in following checks failing. > > > > /* We ignore suid/sgid if there are no mappings for them in the ns */ > > if (!kuid_has_mapping(bprm->cred->user_ns, uid) || > > !kgid_has_mapping(bprm->cred->user_ns, gid)) > > return; > > > > uid/gid fetched from lower fs inode might not be mapped inside the user > > namespace of container. So we need to look at uid/gid fetched from > > upper filesystem (shiftfs in this particular case) and these should be > > mapped and setuid bit can take affect. > > This change is consistent with may_linkat() -> safe_hardlink_source() > and with all instances of notify_change() in fs/open.c, but is not > consistent with file_remove_privs() -> ... > should_remove_suid()/notify_change() > which also uses file_inode() I did a quick grep to figure out who is calling file_remove_privs(). Looks like it is being called by underlying filesystem (ext4, xfs, ntfs, fuse). If that's the case, we probably want to use the dentry and inode of underlying filesystem and not recurse back to top level filesystem? > > So perhaps remove of suid/sgid on truncate/write is also broken? vfs_truncate() seems to be using path->dentry so should be fine for the case of truncate. do_truncate() -->dentry_need_remove_privs() I have taken v2 patches of shiftfs and modifed them to be read-only and testing that. So not doing any write operations right now. > > Do you know if we already have xfstests that cover these cases? No, I have not looked there yet. > > > > > Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx> > > --- > > fs/exec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > Index: rhvgoyal-linux/fs/exec.c > > =================================================================== > > --- rhvgoyal-linux.orig/fs/exec.c 2017-02-10 09:20:31.058642923 -0500 > > +++ rhvgoyal-linux/fs/exec.c 2017-02-13 10:51:52.155751128 -0500 > > @@ -1479,7 +1479,7 @@ static void bprm_fill_uid(struct linux_b > > if (task_no_new_privs(current)) > > return; > > > > - inode = file_inode(bprm->file); > > + inode = d_backing_inode(bprm->file->f_path.dentry); > > I would much rather if we could use a macro for this pattern > like locks_inode(), maybe privs_inode()? > You could use the same macro if appropriate also in > file_remove_privs(). > > and IMO the use of d_backing_inode() here doesn't make > senses. I know it does nothing, but to my understanding its intention > is quite the opposite of the change you are introducing > (it was designed to give the underlying inode, not the union inode). Agreed that I also get confused by d_backing_inode(). So file can give information about two inodes. One stored in file->f_inode which is accessed by file_inode() helper function. Can we create another helper function to access that second inode. Say file_path_inode() and that will return file->f_path.dentry->d_inode? Or I could keep it simple for now and drop d_backing_inode() and access inode directly. inode = file->f_path.dentry->d_inode Vivek