On Thu, Dec 16, 2021 at 02:40:27PM +0100, Christian Brauner wrote:
> On Thu, Dec 16, 2021 at 12:43:19AM -0500, Stefan Berger wrote:
> > From: Stefan Berger <stefanb@xxxxxxxxxxxxx>
> >
> > Extend 'securityfs' for support of IMA namespacing so that each
> > IMA (user) namespace can have its own front-end for showing the currently
> > active policy, the measurement list, number of violations and so on.
> >
> > Drop the addition dentry reference to enable simple cleanup of dentries
> > upon umount.
> >
> > Prevent mounting of an instance of securityfs in another user namespace
> > than it belongs to. Also, prevent accesses to directories when another
> > user namespace is active than the one that the instance of securityfs
> > belongs to.
> >
> > Signed-off-by: Stefan Berger <stefanb@xxxxxxxxxxxxx>
> > Signed-off-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> > ---
> >  security/inode.c | 37 ++++++++++++++++++++++++++++++++++---
> >  1 file changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/security/inode.c b/security/inode.c
> > index fee01ff4d831..a0d9f086e3d5 100644
> > --- a/security/inode.c
> > +++ b/security/inode.c
> > @@ -26,6 +26,29 @@
> >  static struct vfsmount *init_securityfs_mount;
> >  static int init_securityfs_mount_count;
> >
> > +static int securityfs_permission(struct user_namespace *mnt_userns,
> > +				 struct inode *inode, int mask)
> > +{
> > +	int err;
> > +
> > +	err = generic_permission(&init_user_ns, inode, mask);
> > +	if (!err) {
> > +		if (inode->i_sb->s_user_ns != current_user_ns())
> > +			err = -EACCES;
>
> I really think the correct semantics is to grant all callers access
> whose user namespace is the same as or an ancestor of the securityfs
> userns. It's weird to deny access to callers who are located in an
> ancestor userns.
>
> For example, a privileged process on the host should be allowed to setns
> to the userns of an unprivileged container and inspect its securityfs

s/userns/mntns/

> instance.
>
> We're mostly interested in blocking scenarios where two sibling
> unprivileged containers are created in the initial userns and an fd
> proxy or something funnels a file descriptor from one sibling container
> to the other one and the receiving sibling container can use readdir()
> or openat() on this fd. (I'm not even convinced that this is actually a
> problem but stricter semantics at the beginning can't hurt. We can
> always relax this later.)
>
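
To make the semantics discussed above concrete, here is a rough userspace
model of the "same as or an ancestor of" check. It is only a sketch, not
kernel code: struct model_userns and the model_*() helpers are hypothetical
names made up for illustration, and the parent pointer stands in for the
kernel's user namespace hierarchy. The caller is granted access when its
userns owns the securityfs instance or is an ancestor of the owner; a
sibling (the fd-funnelling case) and a descendant are both refused.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct user_namespace. */
struct model_userns {
	const struct model_userns *parent;	/* NULL for the initial userns */
};

/* Return true if 'ancestor' is 'ns' itself or one of its ancestors. */
static bool model_same_or_ancestor(const struct model_userns *ancestor,
				   const struct model_userns *ns)
{
	/* Walk from 'ns' up towards the initial userns. */
	for (; ns != NULL; ns = ns->parent) {
		if (ns == ancestor)
			return true;
	}
	return false;
}

/*
 * Model of the relaxed check: return 0 if the caller's userns owns the
 * securityfs instance or is an ancestor of the owner, -EACCES otherwise.
 */
static int model_securityfs_access(const struct model_userns *caller,
				   const struct model_userns *sb_owner)
{
	return model_same_or_ancestor(caller, sb_owner) ? 0 : -EACCES;
}

/* Example: the initial userns with two sibling container namespaces. */
static void model_example(void)
{
	struct model_userns host = { .parent = NULL };
	struct model_userns a = { .parent = &host };
	struct model_userns b = { .parent = &host };

	assert(model_securityfs_access(&a, &a) == 0);		/* same userns */
	assert(model_securityfs_access(&host, &a) == 0);	/* ancestor */
	assert(model_securityfs_access(&b, &a) == -EACCES);	/* sibling */
	assert(model_securityfs_access(&a, &host) == -EACCES);	/* descendant */
}
```

Calling model_example() from a main() exercises the four cases. The strict
equality test in the patch as posted would differ only in the second case,
where the host would get -EACCES as well.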