On Wed, Feb 8, 2017 at 1:42 AM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2017-02-07 at 14:25 -0800, Christoph Hellwig wrote:
>> On Tue, Feb 07, 2017 at 11:01:29PM +0200, Amir Goldstein wrote:
>> > Project id's are not exactly "subtree" semantic, but inheritance
>> > semantics,
>> > which is not the same when non empty directories get their project
>> > id changed.
>> > Here is a recap:
>> > https://lwn.net/Articles/623835/
>>
>> Yes - but if we abuse them for containers we could refine the
>> semantics to simply not allow change of project ids from inside
>> containers based on say capabilities.
>

You mean something like this:
https://lwn.net/Articles/632917/

With the suggested protected_projects, projid 0 (also inside container)
gets a special meaning, much like user 0, so we may do interesting things
with the projid that is mapped to 0.

> We can't really abuse projectid, it's part of the user namespace
> mapping (for project quota). What we can do is have a new id that
> behaves like it.
>

Perhaps we *can* use projid without abusing it.
userns already maps projids, but there is no concept of "owning project"
for a userns, nor does it make a lot of sense, because projid is not
part of the credentials.

But if we re-brand it as "container root projid", we can try to use it
for defining semantics to grant unprivileged access to a subtree.

The functionality you are trying to get with shiftfs mark does sound a
bit like "container root projid":
- inodes with mapped projid MAY be uid/gid shifted
- inodes with unmapped projid MAY NOT

I realize this may be very raw, but it's a start.
If you like this direction we can try to develop it.

> But like I said, we don't really need a full ID, it would basically just
> be a single bit mark to say remap or not when doing permission checks
> against this inode. It would follow some of the project id semantics
> (like inheritance from parent dir)
>

But a single bit would only work for single level of userns nesting,
won't it?

>> > I guess we should define the semantics for the required sub-tree
>> > marking, before we can talk about solutions.
>>
>> Good plan.
>
> So I've been thinking about how to do this without subtree marking and
> yet retain the subtree properties similar to project id. The advantage
> would be that if it can be done using only inode properties, then none
> of the permission prototypes need change. The only real subtree
> property we need is ability to bind into an unprivileged mount
> namespace, but we already have that. The gotcha about marking inodes
> is that they're all or nothing, so every subtree that gets access to
> the inode inherits the mark. This means that we cannot allow a user
> access to a marked inode without the cover of an unprivileged user
> namespace, but I think that's fixable in the permission check
> (basically if the inode is marked you *only* get access if you have a
> user_ns != init_user_ns and we do the permission shifts or you have
> user_ns == init_user_ns and you are admin capable).
>

I didn't follow, but it sounds like your proposed solution is only good
for single level of userns nesting.

Do you think you can redefine it in terms of "container root projid"?
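
To make the comparison concrete, here is a rough user-space sketch of the two rules discussed above: the single-bit mark check and the "mapped projid MAY be shifted" check. It is only a toy model, not kernel code. The types `toy_userns` and `toy_inode`, the functions `mark_decision()` and `projid_may_shift()`, and the idea of describing a userns projid mapping as one contiguous range in the initial projid space are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types for illustration; these are not kernel structures. */
struct toy_userns {
    const struct toy_userns *parent; /* NULL stands in for init_user_ns */
    uint32_t projid_first;           /* assumed: first projid mapped into this ns */
    uint32_t projid_count;           /* assumed: number of projids mapped */
};

struct toy_inode {
    uint32_t projid; /* project id stored on the inode */
    bool marked;     /* the single-bit "remap" mark */
};

enum toy_decision { TOY_NORMAL, TOY_SHIFTED, TOY_DENIED };

/*
 * Single-bit rule: a marked inode is reachable only with shifted
 * permissions from a non-initial userns, or from the initial userns
 * by an admin-capable task. Unmarked inodes take the ordinary path.
 */
enum toy_decision mark_decision(const struct toy_inode *inode,
                                const struct toy_userns *ns,
                                bool admin_capable)
{
    assert(inode && ns);
    if (!inode->marked)
        return TOY_NORMAL;
    if (ns->parent != NULL)
        return TOY_SHIFTED; /* any non-initial userns, whatever its depth */
    return admin_capable ? TOY_NORMAL : TOY_DENIED;
}

/*
 * "Container root projid" rule: shifting is allowed only when the
 * inode's projid is mapped in the caller's userns.
 */
bool projid_may_shift(const struct toy_inode *inode,
                      const struct toy_userns *ns)
{
    assert(inode && ns);
    if (ns->parent == NULL)
        return false; /* nothing to shift in the initial userns */
    return inode->projid >= ns->projid_first &&
           inode->projid - ns->projid_first < ns->projid_count;
}
```

In this model, take an outer userns that maps projids 100..199, a nested userns that maps only 150..159, and a marked inode with projid 120. mark_decision() answers TOY_SHIFTED for both namespaces, because the bit cannot tell them apart. projid_may_shift() allows the shift for the outer userns and refuses it for the nested one, which is the nesting distinction I am asking about.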