On Wed, 2022-01-05 at 11:18 +0100, Christian Brauner wrote: > On Wed, Jan 05, 2022 at 03:58:11AM +0000, Al Viro wrote: > > On Tue, Jan 04, 2022 at 12:03:58PM -0500, Stefan Berger wrote: > > > From: Stefan Berger <stefanb@xxxxxxxxxxxxx> > > > Drop the additional dentry reference to enable simple cleanup of dentries > > > upon umount. Now the dentries do not need to be explicitly freed anymore > > > but we can just rely on d_genocide() and the dcache shrinker to do all > > > the required work. > > The "additional dentry reference" mentioned only relates to an afaict > unnecessary dget() in securityfs_create_dentry() which I pointed out > as part of earlier reviews. But the phrasing implies that there's a > behavioral change for the initial securityfs instance based on the > removal of this additional dget() when there really isn't. > > After securityfs_create_dentry() has created a new dentry via > lookup_one_len() and eventually called d_instantiate() it currently > takes an additional reference on the newly created dentry via dget(). > This additional reference is then paired with an additional dput() in > securityfs_remove(). I have not yet seen a reason why this is > necessary maybe you can help there. > > For example, contrast this with debugfs which has the same underlying > logic as securityfs, i.e. any created dentry pins the whole filesystem > via simple_pin_fs() until the dentry is released and simple_unpin_fs() > is called. It uses a similar pairing as securityfs: where securityfs > has the securityfs_create_dentry() and securityfs_remove() pairing, > debugfs has the __debugfs_create_file() and debugfs_remove() pairing. > But debugfs doesn't take an additional reference on the just created > dentry in __debugfs_create_file() which would need to be put in > debugfs_remove(). > > So if we contrast the creation routines of securityfs and debugfs directly > condensed to just the dentry references: > > securityfs | debugfs > ---------------- | ------------------ > | > lookup_one_len() | lookup_one_len() > d_instantiate() | d_instantiate() > dget() | > > And I have not understood why securityfs would need that additional > dget(). Not just intrinsically but also when contrasted with debugfs. So > that additional dget() is removed as part of this patch. Assuming it isn't needed, could removing it be a separate patch and upstreamed independently of either the securityfs or IMA namespacing changes? thanks, Mimi > > But the explanation in the commit message isn't ideal as it implies > the removal of the additional dget() would have any impact on the > pinning logic for securityfs when it does not. > > But the pinning logic doesn't make sense outside of the initial > namespace which can never go away and there are security modules that > have files or settings for the whole system that never go away and will > always keep the filesystem around. > > But for unprivileged/userns containers that mount their own securityfs > instance we want the securityfs instance cleaned up when it is > unmounted. There is no need to duplicate the pinning logic or make the > global securityfs instance display different information based on the > userns. Both options would be really messy and hacky. > > Instead we can simply give each userns it's own securityfs instance > similar to how each ipc ns has its own mqueue instance and all entries > in there are cleaned up on umount or when the whole container is > shutdown. After the container is shutdown all of the security module > settings for the container go away with it anyway. So for that we don't > want any filesystem pinning done in securityfs_create_dentry(). And we > also really don't want the additional dget() that is currently taken in > securityfs_create_dentry() as it would pointlessly require us to dput() > during superblock shutdown afaict. None of this however should cause any > behavioral changes for the initial securityfs instance.