Re: [PATCH] vfs: Use upper filesystem inode in bprm_fill_uid()

Amir Goldstein <amir73il@xxxxxxxxx> · Mon, 13 Feb 2017 22:01:09 +0200

On Mon, Feb 13, 2017 at 9:06 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Mon, Feb 13, 2017 at 08:07:16PM +0200, Amir Goldstein wrote:
>> On Mon, Feb 13, 2017 at 6:18 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> > Right now bprm_fill_uid() uses inode fetched from file_inode(bprm->file).
>> > This in turn returns inode of lower filesystem (in a stacked filesystem
>> > setup).
>> >
>> > I was playing with modified patches of shiftfs posted by james bottomley
>> > and realized that through shiftfs setuid bit does not take effect. And
>> > reason being that we fetch uid/gid from inode of lower fs (and not from
>> > shiftfs inode). And that results in following checks failing.
>> >
>> > /* We ignore suid/sgid if there are no mappings for them in the ns */
>> > if (!kuid_has_mapping(bprm->cred->user_ns, uid) ||
>> >     !kgid_has_mapping(bprm->cred->user_ns, gid))
>> >         return;
>> >
>> > uid/gid fetched from lower fs inode might not be mapped inside the user
>> > namespace of container. So we need to look at uid/gid fetched from
>> > upper filesystem (shiftfs in this particular case) and these should be
>> > mapped and setuid bit can take affect.
>>
>> This change is consistent with may_linkat() -> safe_hardlink_source()
>> and with all instances of notify_change() in fs/open.c, but is not
>> consistent with file_remove_privs() -> ...
>> should_remove_suid()/notify_change()
>> which also uses file_inode()
>
> I did a quick grep to figure out who is calling file_remove_privs(). Looks
> like it is being called by underlying filesystem (ext4, xfs, ntfs, fuse).
> If that's the case, we probably want to use the dentry and inode of
> underlying filesystem and not recurse back to top level filesystem?
>

No I think you are right and there is no problem with file_remove_privs().

Not sure about do_coredump(), though, which uses file_inode(cprm.file).
Other places in vfs that check inode->i_uid are even harder to follow.

>>
>> So perhaps remove of suid/sgid on truncate/write is also broken?
>
> vfs_truncate() seems to be using path->dentry so should be fine for the case
> of truncate.
>
> do_truncate() -->dentry_need_remove_privs()
>
> I have taken v2 patches of shiftfs and modifed them to be read-only and
> testing that. So not doing any write operations right now.
>
>>
>> Do you know if we already have xfstests that cover these cases?
>
> No, I have not looked there yet.
>
>>
>> >
>> > Signed-off-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
>> > ---
>> >  fs/exec.c |    2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > Index: rhvgoyal-linux/fs/exec.c
>> > ===================================================================
>> > --- rhvgoyal-linux.orig/fs/exec.c       2017-02-10 09:20:31.058642923 -0500
>> > +++ rhvgoyal-linux/fs/exec.c    2017-02-13 10:51:52.155751128 -0500
>> > @@ -1479,7 +1479,7 @@ static void bprm_fill_uid(struct linux_b
>> >         if (task_no_new_privs(current))
>> >                 return;
>> >
>> > -       inode = file_inode(bprm->file);
>> > +       inode = d_backing_inode(bprm->file->f_path.dentry);
>>
>> I would much rather if we could use a macro for this pattern
>> like locks_inode(), maybe privs_inode()?
>> You could use the same macro if appropriate also in
>> file_remove_privs().
>>
>> and IMO the use of d_backing_inode() here doesn't make
>> senses. I know it does nothing, but to my understanding its intention
>> is quite the opposite of the change you are introducing
>> (it was designed to give the underlying inode, not the union inode).
>
> Agreed that I also get confused by d_backing_inode().
>
> So file can give information about two inodes. One stored in file->f_inode
> which is accessed by file_inode() helper function. Can we create another
> helper function to access that second inode. Say file_path_inode() and that
> will return file->f_path.dentry->d_inode?
>
> Or I could keep it simple for now and drop d_backing_inode() and
> access inode directly.
>
> inode = file->f_path.dentry->d_inode
>

Yes, that would be better.
Using a macro file_path_inode(), IMO, does not add any value.
Using a meaningful macro name (like locks_inode()), which is relevant
to the context, can help understand where this macro should be used
(i.e. instead of file_inode()).
Defining those consistent semantics - I think that's the hard part.