On Mon, Feb 13, 2017 at 9:06 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote: > On Mon, Feb 13, 2017 at 08:07:16PM +0200, Amir Goldstein wrote: >> On Mon, Feb 13, 2017 at 6:18 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote: >> > Right now bprm_fill_uid() uses inode fetched from file_inode(bprm->file). >> > This in turn returns inode of lower filesystem (in a stacked filesystem >> > setup). >> > >> > I was playing with modified patches of shiftfs posted by james bottomley >> > and realized that through shiftfs setuid bit does not take effect. And >> > reason being that we fetch uid/gid from inode of lower fs (and not from >> > shiftfs inode). And that results in following checks failing. >> > >> > /* We ignore suid/sgid if there are no mappings for them in the ns */ >> > if (!kuid_has_mapping(bprm->cred->user_ns, uid) || >> > !kgid_has_mapping(bprm->cred->user_ns, gid)) >> > return; >> > >> > uid/gid fetched from lower fs inode might not be mapped inside the user >> > namespace of container. So we need to look at uid/gid fetched from >> > upper filesystem (shiftfs in this particular case) and these should be >> > mapped and setuid bit can take affect. >> >> This change is consistent with may_linkat() -> safe_hardlink_source() >> and with all instances of notify_change() in fs/open.c, but is not >> consistent with file_remove_privs() -> ... >> should_remove_suid()/notify_change() >> which also uses file_inode() > > I did a quick grep to figure out who is calling file_remove_privs(). Looks > like it is being called by underlying filesystem (ext4, xfs, ntfs, fuse). > If that's the case, we probably want to use the dentry and inode of > underlying filesystem and not recurse back to top level filesystem? > No I think you are right and there is no problem with file_remove_privs(). Not sure about do_coredump(), though, which uses file_inode(cprm.file). Other places in vfs that check inode->i_uid are even harder to follow. >> >> So perhaps remove of suid/sgid on truncate/write is also broken? > > vfs_truncate() seems to be using path->dentry so should be fine for the case > of truncate. > > do_truncate() -->dentry_need_remove_privs() > > I have taken v2 patches of shiftfs and modifed them to be read-only and > testing that. So not doing any write operations right now. > >> >> Do you know if we already have xfstests that cover these cases? > > No, I have not looked there yet. > >> >> > >> > Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx> >> > --- >> > fs/exec.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > Index: rhvgoyal-linux/fs/exec.c >> > =================================================================== >> > --- rhvgoyal-linux.orig/fs/exec.c 2017-02-10 09:20:31.058642923 -0500 >> > +++ rhvgoyal-linux/fs/exec.c 2017-02-13 10:51:52.155751128 -0500 >> > @@ -1479,7 +1479,7 @@ static void bprm_fill_uid(struct linux_b >> > if (task_no_new_privs(current)) >> > return; >> > >> > - inode = file_inode(bprm->file); >> > + inode = d_backing_inode(bprm->file->f_path.dentry); >> >> I would much rather if we could use a macro for this pattern >> like locks_inode(), maybe privs_inode()? >> You could use the same macro if appropriate also in >> file_remove_privs(). >> >> and IMO the use of d_backing_inode() here doesn't make >> senses. I know it does nothing, but to my understanding its intention >> is quite the opposite of the change you are introducing >> (it was designed to give the underlying inode, not the union inode). > > Agreed that I also get confused by d_backing_inode(). > > So file can give information about two inodes. One stored in file->f_inode > which is accessed by file_inode() helper function. Can we create another > helper function to access that second inode. Say file_path_inode() and that > will return file->f_path.dentry->d_inode? > > Or I could keep it simple for now and drop d_backing_inode() and > access inode directly. > > inode = file->f_path.dentry->d_inode > Yes, that would be better. Using a macro file_path_inode(), IMO, does not add any value. Using a meaningful macro name (like locks_inode()), which is relevant to the context, can help understand where this macro should be used (i.e. instead of file_inode()). Defining those consistent semantics - I think that's the hard part.