Re: tmpfs hang after v6.12-rc6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Nov 16, 2024 at 01:58:17AM +0900, Jeongjun Park wrote:
> Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >
> > On Sat, Nov 16, 2024 at 01:33:19AM +0900, Jeongjun Park wrote:
> > > 2024년 11월 16일 (토) 오전 1:25, Chuck Lever <chuck.lever@xxxxxxxxxx>님이 작성:
> > > >
> > > > On Fri, Nov 15, 2024 at 11:04:56AM -0500, Chuck Lever wrote:
> > > > > I've found that NFS access to an exported tmpfs file system hangs
> > > > > indefinitely when the client first performs a GETATTR. The hanging
> > > > > nfsd thread is waiting for the inode lock in shmem_getattr():
> > > > >
> > > > > task:nfsd            state:D stack:0     pid:1775  tgid:1775  ppid:2      flags:0x00004000
> > > > > Call Trace:
> > > > >  <TASK>
> > > > >  __schedule+0x770/0x7b0
> > > > >  schedule+0x33/0x50
> > > > >  schedule_preempt_disabled+0x19/0x30
> > > > >  rwsem_down_read_slowpath+0x206/0x230
> > > > >  down_read+0x3f/0x60
> > > > >  shmem_getattr+0x84/0xf0
> > > > >  vfs_getattr_nosec+0x9e/0xc0
> > > > >  vfs_getattr+0x49/0x50
> > > > >  fh_getattr+0x43/0x50 [nfsd]
> > > > >  fh_fill_pre_attrs+0x4e/0xd0 [nfsd]
> > > > >  nfsd4_open+0x51f/0x910 [nfsd]
> > > > >  nfsd4_proc_compound+0x492/0x5d0 [nfsd]
> > > > >  nfsd_dispatch+0x117/0x1f0 [nfsd]
> > > > >  svc_process_common+0x3b2/0x5e0 [sunrpc]
> > > > >  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> > > > >  svc_process+0xcf/0x130 [sunrpc]
> > > > >  svc_recv+0x64e/0x750 [sunrpc]
> > > > >  ? __wake_up_bit+0x4b/0x60
> > > > >  ? __pfx_nfsd+0x10/0x10 [nfsd]
> > > > >  nfsd+0xc6/0xf0 [nfsd]
> > > > >  kthread+0xed/0x100
> > > > >  ? __pfx_kthread+0x10/0x10
> > > > >  ret_from_fork+0x2e/0x50
> > > > >  ? __pfx_kthread+0x10/0x10
> > > > >  ret_from_fork_asm+0x1a/0x30
> > > > >  </TASK>
> > > > >
> > > > > I bisected the problem to:
> > > > >
> > > > > d949d1d14fa281ace388b1de978e8f2cd52875cf is the first bad commit
> > > > > commit d949d1d14fa281ace388b1de978e8f2cd52875cf
> > > > > Author:     Jeongjun Park <aha310510@xxxxxxxxx>
> > > > > AuthorDate: Mon Sep 9 21:35:58 2024 +0900
> > > > > Commit:     Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > > CommitDate: Mon Oct 28 21:40:39 2024 -0700
> > > > >
> > > > >     mm: shmem: fix data-race in shmem_getattr()
> > > > >
> > > > > ...
> > > > >
> > > > >     Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@xxxxxxxxx
> > > > >     Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
> > > > >     Signed-off-by: Jeongjun Park <aha310510@xxxxxxxxx>
> > > > >     Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxx>
> > > > >     Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > > >     Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> > > > >     Cc: <stable@xxxxxxxxxxxxxxx>
> > > > >     Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > >
> > > > > which first appeared in v6.12-rc6, and adds the line that is waiting
> > > > > on the inode lock when my NFS server hangs.
> > > > >
> > > > > I haven't yet found the process that is holding the inode lock for
> > > > > this inode.
> > > >
> > > > It is likely that the caller (nfsd4_open()-> fh_fill_pre_attrs()) is
> > > > already holding the inode semaphore in this case.
> > >
> > > Thanks for letting me know!
> > >
> > > It seems that the previous patch I wrote was wrong in how to prevent data-race.
> > > It seems that the problem occurs in nfsd because nfsd4_create_file() already
> > > holds the inode_lock.
> > >
> > > After further analysis, I found that this data-race mainly occurs when
> > > vfs_statx_path does not acquire the inode_lock, and in other filesystems,
> > > it is confirmed that inode_lock is acquired in many cases, so I will send a
> > > new patch that fixes this problem right away.
> >
> > Thanks for your quick response!
> >
> > My brief sample of file system ->getattr methods shows that these
> > functions do not grab the inode semaphore at all when calling
> > generic_fillattr(). Likely they expect the method's caller to take
> > it.
> >
> > I strongly prefer to see this commit reverted for v6.12-rc first,
> > and then the new fix should be merged via a normal merge window to
> > permit a lengthy period of testing.
> >
> 
> Hmm... Of course, revert this patch is not a bad idea, but I think that
> patching it like below can effectively prevent data-race without causing
> recursive locking:
> 
> ---
>  mm/shmem.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e87f5d6799a7..d061f8b34d49 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1153,6 +1153,12 @@ static int shmem_getattr(struct mnt_idmap *idmap,
>  {
>     struct inode *inode = path->dentry->d_inode;
>     struct shmem_inode_info *info = SHMEM_I(inode);
> +   bool inode_locked = NULL;
> +
> +   if (!inode_is_locked(inode)) {
> +       inode_lock_shared(inode);
> +       inode_locked = true;
> +   }
> 
>     if (info->alloced - info->swapped != inode->i_mapping->nrpages)
>         shmem_recalc_inode(inode, 0, 0);
> @@ -1166,9 +1172,7 @@ static int shmem_getattr(struct mnt_idmap *idmap,
>     stat->attributes_mask |= (STATX_ATTR_APPEND |
>             STATX_ATTR_IMMUTABLE |
>             STATX_ATTR_NODUMP);
> -   inode_lock_shared(inode);
>     generic_fillattr(idmap, request_mask, inode, stat);
> -   inode_unlock_shared(inode);
> 
>     if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0))
>         stat->blksize = HPAGE_PMD_SIZE;
> @@ -1179,6 +1183,9 @@ static int shmem_getattr(struct mnt_idmap *idmap,
>         stat->btime.tv_nsec = info->i_crtime.tv_nsec;
>     }
> 
> +   if (inode_locked)
> +       inode_unlock_shared(inode);
> +
>     return 0;
>  }
> 
> --
> 
> What do you think?

As I said before, I've failed to find any file system getattr method
that explicitly takes the inode's semaphore around a
generic_fillattr() call. My understanding is that these methods
assume that their caller handles appropriate serialization.
Therefore, taking the inode semaphore at all in shmem_getattr()
seems to me to be the wrong approach entirely.

The point of reverting immediately is that any fix can't possibly
get the review and testing it deserves three days before v6.12
becomes final. Also, it's not clear what the rush to fix the
KCSAN splat is; according to the Fixes: tag, this issue has been
present for years without causing overt problems. It doesn't feel
like this change belongs in an -rc in the first place.

Please revert d949d1d14fa2, then let's discuss a proper fix in a
separate thread. Thanks!


> Regards,
> 
> Jeongjun Park
> 
> >
> > > > > Because this commit addresses only a KCSAN splat that has been
> > > > > present since v4.3, and does not address a reported behavioral
> > > > > issue, I respectfully request that this commit be reverted
> > > > > immediately so that it does not appear in v6.12 final.
> > > > > Troubleshooting and testing should continue until a fix to the KCSAN
> > > > > issue can be found that does not deadlock NFS exports of tmpfs.
> > > > >
> > > > >
> > > > > --
> > > > > Chuck Lever
> > > > >
> > > >
> > > > --
> > > > Chuck Lever
> >
> > --
> > Chuck Lever

-- 
Chuck Lever




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux