On Wed, 2021-09-29 at 18:09 +0800, Ian Kent wrote: > On Tue, 2021-09-28 at 22:07 +0800, Hou Tao wrote: > > A KMSAN warning is reported by Alexander Potapenko: > > I haven't looked at this thoroughly yet I admit but ... > > I'm very sorry but again I don't quite agree with what's done > in the patch. > > > > > BUG: KMSAN: uninit-value in kernfs_dop_revalidate+0x61f/0x840 > > fs/kernfs/dir.c:1053 > > kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053 > > d_revalidate fs/namei.c:854 > > lookup_dcache fs/namei.c:1522 > > __lookup_hash+0x3a6/0x590 fs/namei.c:1543 > > filename_create+0x312/0x7c0 fs/namei.c:3657 > > do_mkdirat+0x103/0x930 fs/namei.c:3900 > > __do_sys_mkdir fs/namei.c:3931 > > __se_sys_mkdir fs/namei.c:3929 > > __x64_sys_mkdir+0xda/0x120 fs/namei.c:3929 > > do_syscall_x64 arch/x86/entry/common.c:51 > > > > It seems a positive dentry in kernfs becomes a negative dentry > > directly > > through d_delete() in vfs_rmdir(). dentry->d_time is uninitialized > > when accessing it in kernfs_dop_revalidate(), because it is only > > initialized when created as negative dentry in kernfs_iop_lookup(). > > It looks more like the final dput() that's sending the dentry to > the LRU list than the call to d_delete(). > > But I could be mistaken and it's just a detail. > > I think the real question is do we want kernfs ->rmdir() to utilize > the VFS negative dentry facility. > > Clearly, based in the kernfs ->rmdir() op function definition kernfs > doesn't want file systems that use it to be able to decide that. > > Unless there is a case for allowing dentries to be re-used I think > they should be dropped in kernfs_iop_rmdir(). > > > > > The problem can be reproduced by the following command: > > > > cd /sys/fs/cgroup/pids && mkdir hi && stat hi && rmdir hi && stat > > hi > > And immediately recreating the directory is probably not a common > use case so it doesn't provide a case for kernfs ->rmdir() not > dropping the dentry to prevent the dentry making it to the LRU > negative dentry list. > > The more likely case is an unnecessary accumulation of negative > dentries from file system activity with occasional negative dentry > re-use which might be undesirable for long running systems. > > > > > A simple fixes seems to be initializing d->d_time for positive > > dentry > > in kernfs_iop_lookup() as well. The downside is the negative dentry > > will be revalidated again after it becomes negative in d_delete(), > > because the revison of its parent must have been increased due to > > its removal. > > I did that for a long time is the series before deciding to do > it only for negatives ... > > The usual case of directory removal is the dentry is dropped by > the ->rmdir() function either directly or via d_invalidate() to > prevent possible incorrect initialization on re-create. > > But this is probably unlikely for kernfs attributes of the same > name so I'm not certain dropping the dentry is in fact what really > should be done here. > > I admit I did miss this case. > > > > > Alternative solution is implement .d_iput for kernfs, and assign > > d_time > > for the newly-generated negative dentry in it. But we may need to > > take kernfs_rwsem to protect again the concurrent > > kernfs_link_sibling() > > on the parent directory, it is a little over-killing. Now the > > simple > > fix is chosen. > > Please don't even consider this, it's misleading because here we > are concerned with the dentry not the inode so I don't think adding > ->d_iput() is the right thing to do. > > > > > Link: https://marc.info/?l=linux-fsdevel&m=163249838610499 > > Fixes: c7e7c04274b1 ("kernfs: use VFS negative dentry caching") > > Reported-by: Alexander Potapenko <glider@xxxxxxxxxx> > > Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx> > > --- > > fs/kernfs/dir.c | 9 +++++++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c > > index 0c7f1558f489..f7618ba5b3b2 100644 > > --- a/fs/kernfs/dir.c > > +++ b/fs/kernfs/dir.c > > @@ -1140,8 +1140,13 @@ static struct dentry > > *kernfs_iop_lookup(struct > > inode *dir, > > if (!inode) > > inode = ERR_PTR(-ENOMEM); > > } > > - /* Needed only for negative dentry validation */ > > - if (!inode) > > + /* > > + * Needed for negative dentry validation. > > + * The negative dentry can be created in > > kernfs_iop_lookup() > > + * or transforms from positive dentry in > > dentry_unlink_inode() > > + * called from vfs_rmdir(). > > + */ > > + if (!IS_ERR(inode)) > > kernfs_set_rev(parent, dentry); > > up_read(&kernfs_rwsem); > > > > This is not a bad idea to do and is what I did for quite a while too > but maybe it would be better to do this: I see Greg has pushed the above patch to Linus. After thinking about it this is a good call, after all the reported bug was about a lack of initialization and that's specifically what the patch addresses. Ian > > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c > index a957c944cf3a..3fd9d8fa4b92 100644 > --- a/fs/kernfs/dir.c > +++ b/fs/kernfs/dir.c > @@ -1165,6 +1165,8 @@ static int kernfs_iop_rmdir(struct inode *dir, > struct dentry *dentry) > return -ENODEV; > > ret = scops->rmdir(kn); > + if (!ret) > + d_invalidate(dentry); > > kernfs_put_active(kn); > return ret; >