Re: tmpfs hang after v6.12-rc6

Jeongjun Park <aha310510@xxxxxxxxx> · Sat, 16 Nov 2024 01:58:17 +0900

Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> On Sat, Nov 16, 2024 at 01:33:19AM +0900, Jeongjun Park wrote:
> > 2024년 11월 16일 (토) 오전 1:25, Chuck Lever <chuck.lever@xxxxxxxxxx>님이 작성:
> > >
> > > On Fri, Nov 15, 2024 at 11:04:56AM -0500, Chuck Lever wrote:
> > > > I've found that NFS access to an exported tmpfs file system hangs
> > > > indefinitely when the client first performs a GETATTR. The hanging
> > > > nfsd thread is waiting for the inode lock in shmem_getattr():
> > > >
> > > > task:nfsd            state:D stack:0     pid:1775  tgid:1775  ppid:2      flags:0x00004000
> > > > Call Trace:
> > > >  <TASK>
> > > >  __schedule+0x770/0x7b0
> > > >  schedule+0x33/0x50
> > > >  schedule_preempt_disabled+0x19/0x30
> > > >  rwsem_down_read_slowpath+0x206/0x230
> > > >  down_read+0x3f/0x60
> > > >  shmem_getattr+0x84/0xf0
> > > >  vfs_getattr_nosec+0x9e/0xc0
> > > >  vfs_getattr+0x49/0x50
> > > >  fh_getattr+0x43/0x50 [nfsd]
> > > >  fh_fill_pre_attrs+0x4e/0xd0 [nfsd]
> > > >  nfsd4_open+0x51f/0x910 [nfsd]
> > > >  nfsd4_proc_compound+0x492/0x5d0 [nfsd]
> > > >  nfsd_dispatch+0x117/0x1f0 [nfsd]
> > > >  svc_process_common+0x3b2/0x5e0 [sunrpc]
> > > >  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> > > >  svc_process+0xcf/0x130 [sunrpc]
> > > >  svc_recv+0x64e/0x750 [sunrpc]
> > > >  ? __wake_up_bit+0x4b/0x60
> > > >  ? __pfx_nfsd+0x10/0x10 [nfsd]
> > > >  nfsd+0xc6/0xf0 [nfsd]
> > > >  kthread+0xed/0x100
> > > >  ? __pfx_kthread+0x10/0x10
> > > >  ret_from_fork+0x2e/0x50
> > > >  ? __pfx_kthread+0x10/0x10
> > > >  ret_from_fork_asm+0x1a/0x30
> > > >  </TASK>
> > > >
> > > > I bisected the problem to:
> > > >
> > > > d949d1d14fa281ace388b1de978e8f2cd52875cf is the first bad commit
> > > > commit d949d1d14fa281ace388b1de978e8f2cd52875cf
> > > > Author:     Jeongjun Park <aha310510@xxxxxxxxx>
> > > > AuthorDate: Mon Sep 9 21:35:58 2024 +0900
> > > > Commit:     Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > CommitDate: Mon Oct 28 21:40:39 2024 -0700
> > > >
> > > >     mm: shmem: fix data-race in shmem_getattr()
> > > >
> > > > ...
> > > >
> > > >     Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@xxxxxxxxx
> > > >     Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
> > > >     Signed-off-by: Jeongjun Park <aha310510@xxxxxxxxx>
> > > >     Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxx>
> > > >     Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > >     Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> > > >     Cc: <stable@xxxxxxxxxxxxxxx>
> > > >     Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > >
> > > > which first appeared in v6.12-rc6, and adds the line that is waiting
> > > > on the inode lock when my NFS server hangs.
> > > >
> > > > I haven't yet found the process that is holding the inode lock for
> > > > this inode.
> > >
> > > It is likely that the caller (nfsd4_open()-> fh_fill_pre_attrs()) is
> > > already holding the inode semaphore in this case.
> >
> > Thanks for letting me know!
> >
> > It seems that the previous patch I wrote was wrong in how to prevent data-race.
> > It seems that the problem occurs in nfsd because nfsd4_create_file() already
> > holds the inode_lock.
> >
> > After further analysis, I found that this data-race mainly occurs when
> > vfs_statx_path does not acquire the inode_lock, and in other filesystems,
> > it is confirmed that inode_lock is acquired in many cases, so I will send a
> > new patch that fixes this problem right away.
>
> Thanks for your quick response!
>
> My brief sample of file system ->getattr methods shows that these
> functions do not grab the inode semaphore at all when calling
> generic_fillattr(). Likely they expect the method's caller to take
> it.
>
> I strongly prefer to see this commit reverted for v6.12-rc first,
> and then the new fix should be merged via a normal merge window to
> permit a lengthy period of testing.
>

Hmm... Of course, revert this patch is not a bad idea, but I think that
patching it like below can effectively prevent data-race without causing
recursive locking:

---
 mm/shmem.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index e87f5d6799a7..d061f8b34d49 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1153,6 +1153,12 @@ static int shmem_getattr(struct mnt_idmap *idmap,
 {
    struct inode *inode = path->dentry->d_inode;
    struct shmem_inode_info *info = SHMEM_I(inode);
+   bool inode_locked = NULL;
+
+   if (!inode_is_locked(inode)) {
+       inode_lock_shared(inode);
+       inode_locked = true;
+   }

    if (info->alloced - info->swapped != inode->i_mapping->nrpages)
        shmem_recalc_inode(inode, 0, 0);
@@ -1166,9 +1172,7 @@ static int shmem_getattr(struct mnt_idmap *idmap,
    stat->attributes_mask |= (STATX_ATTR_APPEND |
            STATX_ATTR_IMMUTABLE |
            STATX_ATTR_NODUMP);
-   inode_lock_shared(inode);
    generic_fillattr(idmap, request_mask, inode, stat);
-   inode_unlock_shared(inode);

    if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0))
        stat->blksize = HPAGE_PMD_SIZE;
@@ -1179,6 +1183,9 @@ static int shmem_getattr(struct mnt_idmap *idmap,
        stat->btime.tv_nsec = info->i_crtime.tv_nsec;
    }

+   if (inode_locked)
+       inode_unlock_shared(inode);
+
    return 0;
 }

--

What do you think?

Regards,

Jeongjun Park

>
> > > > Because this commit addresses only a KCSAN splat that has been
> > > > present since v4.3, and does not address a reported behavioral
> > > > issue, I respectfully request that this commit be reverted
> > > > immediately so that it does not appear in v6.12 final.
> > > > Troubleshooting and testing should continue until a fix to the KCSAN
> > > > issue can be found that does not deadlock NFS exports of tmpfs.
> > > >
> > > >
> > > > --
> > > > Chuck Lever
> > > >
> > >
> > > --
> > > Chuck Lever
>
> --
> Chuck Lever