On Tue, 2017-02-07 at 10:10 -0800, Christoph Hellwig wrote:
> On Tue, Feb 07, 2017 at 07:59:00PM +0200, Amir Goldstein wrote:
> > I am not even sure that would be enough.
> > dentry does not contain information about the mount user came from,
> > and sb contains only information about the user ns of the mounter
> > of the file system, not the mounter of the bind mount, right?
> > I think I am missing some big pieces of the big picture.
> > Would love to hear what Eric has to say.
>
> IFF we want to do what shiftfs does properly we need vfsmount +
> inode, no need for the dentry.

Yes, sorry ... I was thinking the dentry contained the mnt, but it
doesn't; that's the path.  However, threading the mnt through looks
substantially harder.

> But maybe we need to go back and decide if we want to allow uid/gid
> remapping for arbitrary subtrees anyway.

So those were the original patches Djalal was referring to.  The
problem there is that a lot of orchestration systems don't store the
images they want to bind mount into containers on separately mounted
filesystems, which is what's needed to avoid this being per-subtree.

However, the clinching argument for me is that the canonical container
image *is* a subtree (unlike a VM image, which has to be mounted).  If
we don't make this work on subtrees, people will go back to daft stacks
for containers, like copying the image subtree into a loopback-mounted
filesystem just to make this all work (and then complain about
performance and caching and so on).

> Another option would be to require something like a project as used
> for project quotas as the root.  This would also be convenient as it
> could store the used remapping tables.

So this would be like the current project quota except set on a
subtree?  I could see it being done that way, but I don't see what
advantage it has over using flags in the subtree itself (the mapping is
known based on the mount namespace, so there's really only a single bit
of information to store).

James