Re: [PATCH] kernfs: also call kernfs_set_rev() for positive dentry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2021-09-28 at 22:07 +0800, Hou Tao wrote:
> A KMSAN warning is reported by Alexander Potapenko:

I haven't looked at this thoroughly yet I admit but ...

I'm very sorry but again I don't quite agree with what's done
in the patch.

> 
> BUG: KMSAN: uninit-value in kernfs_dop_revalidate+0x61f/0x840
> fs/kernfs/dir.c:1053
>  kernfs_dop_revalidate+0x61f/0x840 fs/kernfs/dir.c:1053
>  d_revalidate fs/namei.c:854
>  lookup_dcache fs/namei.c:1522
>  __lookup_hash+0x3a6/0x590 fs/namei.c:1543
>  filename_create+0x312/0x7c0 fs/namei.c:3657
>  do_mkdirat+0x103/0x930 fs/namei.c:3900
>  __do_sys_mkdir fs/namei.c:3931
>  __se_sys_mkdir fs/namei.c:3929
>  __x64_sys_mkdir+0xda/0x120 fs/namei.c:3929
>  do_syscall_x64 arch/x86/entry/common.c:51
> 
> It seems a positive dentry in kernfs becomes a negative dentry directly
> through d_delete() in vfs_rmdir(). dentry->d_time is uninitialized
> when accessing it in kernfs_dop_revalidate(), because it is only
> initialized when created as negative dentry in kernfs_iop_lookup().

It looks more like the final dput() that's sending the dentry to
the LRU list than the call to d_delete().

But I could be mistaken and it's just a detail.

I think the real question is do we want kernfs ->rmdir() to utilize
the VFS negative dentry facility.

Clearly, based in the kernfs ->rmdir() op function definition kernfs
doesn't want file systems that use it to be able to decide that.

Unless there is a case for allowing dentries to be re-used I think
they should be dropped in kernfs_iop_rmdir().

> 
> The problem can be reproduced by the following command:
> 
>   cd /sys/fs/cgroup/pids && mkdir hi && stat hi && rmdir hi && stat hi

And immediately recreating the directory is probably not a common
use case so it doesn't provide a case for kernfs ->rmdir() not
dropping the dentry to prevent the dentry making it to the LRU
negative dentry list.

The more likely case is an unnecessary accumulation of negative
dentries from file system activity with occasional negative dentry
re-use which might be undesirable for long running systems.

> 
> A simple fixes seems to be initializing d->d_time for positive dentry
> in kernfs_iop_lookup() as well. The downside is the negative dentry
> will be revalidated again after it becomes negative in d_delete(),
> because the revison of its parent must have been increased due to
> its removal.

I did that for a long time is the series before deciding to do
it only for negatives ...

The usual case of directory removal is the dentry is dropped by
the ->rmdir() function either directly or via d_invalidate() to
prevent possible incorrect initialization on re-create.

But this is probably unlikely for kernfs attributes of the same
name so I'm not certain dropping the dentry is in fact what really
should be done here.

I admit I did miss this case.

> 
> Alternative solution is implement .d_iput for kernfs, and assign d_time
> for the newly-generated negative dentry in it. But we may need to
> take kernfs_rwsem to protect again the concurrent kernfs_link_sibling()
> on the parent directory, it is a little over-killing. Now the simple
> fix is chosen.

Please don't even consider this, it's misleading because here we
are concerned with the dentry not the inode so I don't think adding
->d_iput() is the right thing to do.

> 
> Link: https://marc.info/?l=linux-fsdevel&m=163249838610499
> Fixes: c7e7c04274b1 ("kernfs: use VFS negative dentry caching")
> Reported-by: Alexander Potapenko <glider@xxxxxxxxxx>
> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> ---
>  fs/kernfs/dir.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
> index 0c7f1558f489..f7618ba5b3b2 100644
> --- a/fs/kernfs/dir.c
> +++ b/fs/kernfs/dir.c
> @@ -1140,8 +1140,13 @@ static struct dentry *kernfs_iop_lookup(struct
> inode *dir,
>                 if (!inode)
>                         inode = ERR_PTR(-ENOMEM);
>         }
> -       /* Needed only for negative dentry validation */
> -       if (!inode)
> +       /*
> +        * Needed for negative dentry validation.
> +        * The negative dentry can be created in kernfs_iop_lookup()
> +        * or transforms from positive dentry in dentry_unlink_inode()
> +        * called from vfs_rmdir().
> +        */
> +       if (!IS_ERR(inode))
>                 kernfs_set_rev(parent, dentry);
>         up_read(&kernfs_rwsem);
>  

This is not a bad idea to do and is what I did for quite a while too
but maybe it would be better to do this:

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index a957c944cf3a..3fd9d8fa4b92 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1165,6 +1165,8 @@ static int kernfs_iop_rmdir(struct inode *dir, struct dentry *dentry)
                return -ENODEV;
 
        ret = scops->rmdir(kn);
+       if (!ret)
+               d_invalidate(dentry);
 
        kernfs_put_active(kn);
        return ret;





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux