On Thu, 2021-12-02 at 21:31 -0500, Stefan Berger wrote: [...] > static int securityfs_init_fs_context(struct fs_context *fc) > { > + int rc; > + > + if (fc->user_ns->ima_ns->late_fs_init) { > + rc = fc->user_ns->ima_ns->late_fs_init(fc->user_ns); > + if (rc) > + return rc; > + } > fc->ops = &securityfs_context_ops; > return 0; > } I know I suggested this, but to get this to work in general, it's going to have to not be specific to IMA, so it's going to have to become something generic like a notifier chain. The other problem is it's only working still by accident: > +int ima_fs_ns_init(struct ima_namespace *ns) > +{ > + ns->mount = securityfs_ns_create_mount(ns->user_ns); This actually triggers on the call to securityfs_init_fs_context, but nothing happens because the callback is null. Every subsequent use of fscontext will trigger this. The point of a keyed supeblock is that fill_super is only called once per key, that's the place we should be doing this. It should also probably be a blocking notifier so any consumer of securityfs can be namespaced by registering for this notifier. > + if (IS_ERR(ns->mount)) { > + ns->mount = NULL; > + return -1; > + } > + ns->mount_count = 1; This is a bit nasty, too: we're spilling the guts of mount count tracking into IMA instead of encapsulating it inside securityfs. > + > + /* Adjust the trigger for user namespace's early teardown of > dependent > + * namespaces. Due to the filesystem there's an additional > reference > + * to the user namespace. > + */ > + ns->user_ns->refcount_teardown += 1; > + > + ns->late_fs_init = ima_fs_ns_late_init; > + > + return 0; > +} I think what should be happening is that we shouldn't so the simple_pin_fs, which creates the inodes, ahead of time; we should do it inside fill_super using a notifier, meaning it gets called once per key, creates the root dentry then triggers the notifier which instantiates all the namespaced entries. We can still use simple_pin_fs for this because there's no locking across fill_super. This would mean fill_super would be called the first time the securityfs is mounted inside the namespace. If we do it this way, we can now make securityfs have its own mount and mount_count inside the user namespace, which it uses internally to the securityfs code, thus avoiding exposing them to ima or any other namespaced consumer. I also think we now don't need the securityfs_ns_ duplicated functions because the callback via the notifier chain now ensures we can use the namespace they were created in to distinguish between non namespaced and namespaced entries. So non-namespaced consumers of securityfs would do what they do now (calling the securityfs_create on initialization) and namespaced consumers would register a callback on the notifier which would get called once for every namespace the securityfs gets mounted in. I also theorize if we do it with notifiers, we could have a notifier on kill_sb to tear down all the entires. If we do this, I think we don't have to pin any more. James