Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Thu, 16 Feb 2017 07:51:58 -0800

On Wed, 2017-02-15 at 09:17 -0500, Vivek Goyal wrote:
> On Tue, Feb 14, 2017 at 03:45:55PM -0800, James Bottomley wrote:
> > On Tue, 2017-02-14 at 18:03 -0500, Vivek Goyal wrote:

[...]
> > > Given we have already shifted the uid/gid for shiftfs inode, I am
> > > wondering that why can't we simply call
> > > generic_permission(shiftfs_inode, mask) directly in the context
> > > of caller. Something like..
> > > 
> > > shiftfs_permission() {
> > >   err = generic_permission(inode, mask);
> > >   if (err)
> > >   	return err;
> > > 
> > >   switch_to_mounter_creds;
> > >   err = inode_permission(reali, mask);
> > >   revert_creds();
> > > 
> > >   return err;
> > > }
> > 
> > Because if the reali->d_iop->permission exists, you should use it. 
> >  You could argue shiftfs_permission should be
> > 
> > 	if (iop->permission) {
> > 		oldcred = shiftfs_new_creds(&newcred, inode->i_sb);
> > 		err = iop->permission(reali, mask);
> > 		shiftfs_old_creds(oldcred, &newcred);
> > 	} else
> > 		err = generic_permission(inode, mask);
> > 
> > But really that's a small optimisation.
> 
> ok. I thought using mounter's creds for real inode checks, will 
> probably do away with need of modifying caller's user namespace in
> shiftfs_get_up_creds().

Well, no ... the mounter of a marked superblock is container admin, but
the owner in the filesystem view is real root.  The unprivileged
mounter's credentials aren't sufficient, therefore. 

> cred->fsuid = KUIDT_INIT(from_kuid(sb->s_user_ns, cred->fsuid));
> cred->fsgid = KGIDT_INIT(from_kgid(sb->s_user_ns, cred->fsgid));
> cred->user_ns = ssi->userns;
> 
> IIUC, we are shifting caller's fsuid and fsgid into caller's user
> namespace but at the same time using the user_ns of reali->sb
> ->sb_user_ns. Feels little twisted to me. May be I am
> misunderstanding it.

Actually what we're doing is shifting the credentials into the
underlying mount's filesystem view.

> Two levels of checks will simplify this a bit. Top level inode will 
> belong to the user namespace of caller and checks should pass. And 
> mounter's creds will have ownership over the real inode so no 
> additional namespace shifting required there.

That's the problem: for a marked mount, they don't.

>  We could also save these creds at mount time once and don't have to
> prepare it for every call. (not sure if it has significant
> performance issue or not).  Just thinking aloud...

If it proves to be an issue, I suppose the shift could be cached, but I
really don't think it matters that much.

James