On Wed, 2017-02-15 at 09:17 -0500, Vivek Goyal wrote:
> On Tue, Feb 14, 2017 at 03:45:55PM -0800, James Bottomley wrote:
> > On Tue, 2017-02-14 at 18:03 -0500, Vivek Goyal wrote:
[...]
> > > Given we have already shifted the uid/gid for shiftfs inode, I am
> > > wondering that why can't we simply call
> > > generic_permission(shiftfs_inode, mask) directly in the context
> > > of caller. Something like..
> > >
> > > shiftfs_permission() {
> > > 	err = generic_permission(inode, mask);
> > > 	if (err)
> > > 		return err;
> > >
> > > 	switch_to_mounter_creds;
> > > 	err = inode_permission(reali, mask);
> > > 	revert_creds();
> > >
> > > 	return err;
> > > }
> >
> > Because if the reali->d_iop->permission exists, you should use it.
> > You could argue shiftfs_permission should be
> >
> > if (iop->permission) {
> > 	oldcred = shiftfs_new_creds(&newcred, inode->i_sb);
> > 	err = iop->permission(reali, mask);
> > 	shiftfs_old_creds(oldcred, &newcred);
> > } else
> > 	err = generic_permission(inode, mask);
> >
> > But really that's a small optimisation.
>
> ok. I thought using mounter's creds for real inode checks, will
> probably do away with need of modifying caller's user namespace in
> shiftfs_get_up_creds().

Well, no ... the mounter of a marked superblock is container admin, but
the owner in the filesystem view is real root. The unprivileged
mounter's credentials aren't sufficient, therefore.

> 	cred->fsuid = KUIDT_INIT(from_kuid(sb->s_user_ns, cred->fsuid));
> 	cred->fsgid = KGIDT_INIT(from_kgid(sb->s_user_ns, cred->fsgid));
> 	cred->user_ns = ssi->userns;
>
> IIUC, we are shifting caller's fsuid and fsgid into caller's user
> namespace but at the same time using the user_ns of reali->sb
> ->sb_user_ns. Feels little twisted to me. May be I am
> misunderstanding it.

Actually what we're doing is shifting the credentials into the
underlying mount's filesystem view.

> Two levels of checks will simplify this a bit. Top level inode will
> belong to the user namespace of caller and checks should pass. And
> mounter's creds will have ownership over the real inode so no
> additional namespace shifting required there.

That's the problem: for a marked mount, they don't.

> We could also save these creds at mount time once and don't have to
> prepare it for every call. (not sure if it has significant
> performance issue or not). Just thinking aloud...

If it proves to be an issue, I suppose the shift could be cached, but I
really don't think it matters that much.

James
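
For illustration only, here is a rough userspace sketch of why the
mounter's unshifted credentials aren't enough on a marked mount. It is
not kernel code: the struct and the model_* functions are hypothetical
names made up for this example, and the mapping used in the note below
(namespace uid 0 at kernel uid 100000, 65536 ids) is just an assumed
container setup. It only models the KUIDT_INIT(from_kuid(...)) step
quoted above, where the id as seen inside the namespace is reused as the
id in the underlying mount's filesystem view.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define INVALID_ID ((uint32_t)-1)

/* One extent of a user namespace id map, laid out like a uid_map line. */
struct id_extent {
	uint32_t first;       /* first id as seen inside the namespace */
	uint32_t lower_first; /* first id as seen by the kernel */
	uint32_t count;       /* number of ids covered by the extent */
};

/* Model of from_kuid(): translate a kernel id into the namespace's view. */
static uint32_t model_from_kid(const struct id_extent *map, uint32_t kid)
{
	if (kid >= map->lower_first && kid - map->lower_first < map->count)
		return map->first + (kid - map->lower_first);
	return INVALID_ID; /* id is not mapped in this namespace */
}

/*
 * Model of the shift: the namespace-local value is reinterpreted as a
 * kernel id in the underlying filesystem's view, mirroring
 * KUIDT_INIT(from_kuid(ns, kuid)).
 */
static uint32_t model_shift(const struct id_extent *map, uint32_t kid)
{
	return model_from_kid(map, kid);
}

/* Grossly simplified owner test; real permission checks do far more. */
static bool model_owner_matches(uint32_t fsuid, uint32_t inode_uid)
{
	return fsuid != INVALID_ID && fsuid == inode_uid;
}
```

With the assumed map { 0, 100000, 65536 }, container root runs with
kernel uid 100000 while the real inode is owned by uid 0, so
model_owner_matches(100000, 0) is false. After the shift,
model_shift(&map, 100000) yields 0 and the owner test passes, which is
the case the shifted credentials are there to handle.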