Re: tmpfs hang after v6.12-rc6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Nov 16, 2024 at 01:33:19AM +0900, Jeongjun Park wrote:
> 2024년 11월 16일 (토) 오전 1:25, Chuck Lever <chuck.lever@xxxxxxxxxx>님이 작성:
> >
> > On Fri, Nov 15, 2024 at 11:04:56AM -0500, Chuck Lever wrote:
> > > I've found that NFS access to an exported tmpfs file system hangs
> > > indefinitely when the client first performs a GETATTR. The hanging
> > > nfsd thread is waiting for the inode lock in shmem_getattr():
> > >
> > > task:nfsd            state:D stack:0     pid:1775  tgid:1775  ppid:2      flags:0x00004000
> > > Call Trace:
> > >  <TASK>
> > >  __schedule+0x770/0x7b0
> > >  schedule+0x33/0x50
> > >  schedule_preempt_disabled+0x19/0x30
> > >  rwsem_down_read_slowpath+0x206/0x230
> > >  down_read+0x3f/0x60
> > >  shmem_getattr+0x84/0xf0
> > >  vfs_getattr_nosec+0x9e/0xc0
> > >  vfs_getattr+0x49/0x50
> > >  fh_getattr+0x43/0x50 [nfsd]
> > >  fh_fill_pre_attrs+0x4e/0xd0 [nfsd]
> > >  nfsd4_open+0x51f/0x910 [nfsd]
> > >  nfsd4_proc_compound+0x492/0x5d0 [nfsd]
> > >  nfsd_dispatch+0x117/0x1f0 [nfsd]
> > >  svc_process_common+0x3b2/0x5e0 [sunrpc]
> > >  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> > >  svc_process+0xcf/0x130 [sunrpc]
> > >  svc_recv+0x64e/0x750 [sunrpc]
> > >  ? __wake_up_bit+0x4b/0x60
> > >  ? __pfx_nfsd+0x10/0x10 [nfsd]
> > >  nfsd+0xc6/0xf0 [nfsd]
> > >  kthread+0xed/0x100
> > >  ? __pfx_kthread+0x10/0x10
> > >  ret_from_fork+0x2e/0x50
> > >  ? __pfx_kthread+0x10/0x10
> > >  ret_from_fork_asm+0x1a/0x30
> > >  </TASK>
> > >
> > > I bisected the problem to:
> > >
> > > d949d1d14fa281ace388b1de978e8f2cd52875cf is the first bad commit
> > > commit d949d1d14fa281ace388b1de978e8f2cd52875cf
> > > Author:     Jeongjun Park <aha310510@xxxxxxxxx>
> > > AuthorDate: Mon Sep 9 21:35:58 2024 +0900
> > > Commit:     Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > CommitDate: Mon Oct 28 21:40:39 2024 -0700
> > >
> > >     mm: shmem: fix data-race in shmem_getattr()
> > >
> > > ...
> > >
> > >     Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@xxxxxxxxx
> > >     Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
> > >     Signed-off-by: Jeongjun Park <aha310510@xxxxxxxxx>
> > >     Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxx>
> > >     Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > >     Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> > >     Cc: <stable@xxxxxxxxxxxxxxx>
> > >     Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > >
> > > which first appeared in v6.12-rc6, and adds the line that is waiting
> > > on the inode lock when my NFS server hangs.
> > >
> > > I haven't yet found the process that is holding the inode lock for
> > > this inode.
> >
> > It is likely that the caller (nfsd4_open()-> fh_fill_pre_attrs()) is
> > already holding the inode semaphore in this case.
> 
> Thanks for letting me know!
> 
> It seems that the previous patch I wrote was wrong in how to prevent data-race.
> It seems that the problem occurs in nfsd because nfsd4_create_file() already
> holds the inode_lock.
> 
> After further analysis, I found that this data-race mainly occurs when
> vfs_statx_path does not acquire the inode_lock, and in other filesystems,
> it is confirmed that inode_lock is acquired in many cases, so I will send a
> new patch that fixes this problem right away.

Thanks for your quick response!

My brief sample of file system ->getattr methods shows that these
functions do not grab the inode semaphore at all when calling
generic_fillattr(). Likely they expect the method's caller to take
it.

I strongly prefer to see this commit reverted for v6.12-rc first,
and then the new fix should be merged via a normal merge window to
permit a lengthy period of testing.


> > > Because this commit addresses only a KCSAN splat that has been
> > > present since v4.3, and does not address a reported behavioral
> > > issue, I respectfully request that this commit be reverted
> > > immediately so that it does not appear in v6.12 final.
> > > Troubleshooting and testing should continue until a fix to the KCSAN
> > > issue can be found that does not deadlock NFS exports of tmpfs.
> > >
> > >
> > > --
> > > Chuck Lever
> > >
> >
> > --
> > Chuck Lever

-- 
Chuck Lever




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux