On Sat, Nov 16, 2024 at 01:33:19AM +0900, Jeongjun Park wrote:
> On Sat, Nov 16, 2024 at 1:25 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >
> > On Fri, Nov 15, 2024 at 11:04:56AM -0500, Chuck Lever wrote:
> > > I've found that NFS access to an exported tmpfs file system hangs
> > > indefinitely when the client first performs a GETATTR. The hanging
> > > nfsd thread is waiting for the inode lock in shmem_getattr():
> > >
> > > task:nfsd state:D stack:0 pid:1775 tgid:1775 ppid:2 flags:0x00004000
> > > Call Trace:
> > > <TASK>
> > > __schedule+0x770/0x7b0
> > > schedule+0x33/0x50
> > > schedule_preempt_disabled+0x19/0x30
> > > rwsem_down_read_slowpath+0x206/0x230
> > > down_read+0x3f/0x60
> > > shmem_getattr+0x84/0xf0
> > > vfs_getattr_nosec+0x9e/0xc0
> > > vfs_getattr+0x49/0x50
> > > fh_getattr+0x43/0x50 [nfsd]
> > > fh_fill_pre_attrs+0x4e/0xd0 [nfsd]
> > > nfsd4_open+0x51f/0x910 [nfsd]
> > > nfsd4_proc_compound+0x492/0x5d0 [nfsd]
> > > nfsd_dispatch+0x117/0x1f0 [nfsd]
> > > svc_process_common+0x3b2/0x5e0 [sunrpc]
> > > ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
> > > svc_process+0xcf/0x130 [sunrpc]
> > > svc_recv+0x64e/0x750 [sunrpc]
> > > ? __wake_up_bit+0x4b/0x60
> > > ? __pfx_nfsd+0x10/0x10 [nfsd]
> > > nfsd+0xc6/0xf0 [nfsd]
> > > kthread+0xed/0x100
> > > ? __pfx_kthread+0x10/0x10
> > > ret_from_fork+0x2e/0x50
> > > ? __pfx_kthread+0x10/0x10
> > > ret_from_fork_asm+0x1a/0x30
> > > </TASK>
> > >
> > > I bisected the problem to:
> > >
> > > d949d1d14fa281ace388b1de978e8f2cd52875cf is the first bad commit
> > > commit d949d1d14fa281ace388b1de978e8f2cd52875cf
> > > Author: Jeongjun Park <aha310510@xxxxxxxxx>
> > > AuthorDate: Mon Sep 9 21:35:58 2024 +0900
> > > Commit: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > CommitDate: Mon Oct 28 21:40:39 2024 -0700
> > >
> > > mm: shmem: fix data-race in shmem_getattr()
> > >
> > > ...
> > >
> > > Link: https://lkml.kernel.org/r/20240909123558.70229-1-aha310510@xxxxxxxxx
> > > Fixes: 44a30220bc0a ("shmem: recalculate file inode when fstat")
> > > Signed-off-by: Jeongjun Park <aha310510@xxxxxxxxx>
> > > Reported-by: syzbot <syzkaller@xxxxxxxxxxxxxxx>
> > > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> > > Cc: <stable@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > >
> > > which first appeared in v6.12-rc6, and adds the line that is waiting
> > > on the inode lock when my NFS server hangs.
> > >
> > > I haven't yet found the process that is holding the inode lock for
> > > this inode.
> >
> > It is likely that the caller (nfsd4_open() -> fh_fill_pre_attrs()) is
> > already holding the inode semaphore in this case.
>
> Thanks for letting me know!
>
> It seems that the previous patch I wrote was wrong in how to prevent data-race.
> It seems that the problem occurs in nfsd because nfsd4_create_file() already
> holds the inode_lock.
>
> After further analysis, I found that this data-race mainly occurs when
> vfs_statx_path does not acquire the inode_lock, and in other filesystems,
> it is confirmed that inode_lock is acquired in many cases, so I will send a
> new patch that fixes this problem right away.

Thanks for your quick response!

My brief sample of file system ->getattr methods shows that these
functions do not grab the inode semaphore at all when calling
generic_fillattr(). Likely they expect the method's caller to take
it.

I strongly prefer to see this commit reverted for v6.12-rc first,
and then the new fix should be merged via a normal merge window to
permit a lengthy period of testing.

> > > Because this commit addresses only a KCSAN splat that has been
> > > present since v4.3, and does not address a reported behavioral
> > > issue, I respectfully request that this commit be reverted
> > > immediately so that it does not appear in v6.12 final.
> > > Troubleshooting and testing should continue until a fix to the KCSAN
> > > issue can be found that does not deadlock NFS exports of tmpfs.
> > >
> > >
> > > --
> > > Chuck Lever
> > >
> >
> > --
> > Chuck Lever

--
Chuck Lever
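
For illustration only, here is a small userspace model of the suspected
self-deadlock. It is not kernel code. Every name in it (Inode,
getattr_locked, getattr_unlocked, open_path) is hypothetical, a plain
non-recursive threading.Lock stands in for the inode rwsem, and a short
timeout stands in for what would be an indefinite hang in the kernel.

```python
import threading


class Inode:
    """Toy stand-in for an inode; 'lock' models a non-recursive inode lock."""

    def __init__(self, size):
        self.lock = threading.Lock()
        self.size = size


def getattr_locked(inode, timeout=0.1):
    """Model of a ->getattr that takes the inode lock itself.

    Returns the size, or None if the lock could not be taken within
    'timeout' seconds. A real kernel thread would sleep forever instead
    of timing out.
    """
    if not inode.lock.acquire(timeout=timeout):
        return None
    try:
        return inode.size
    finally:
        inode.lock.release()


def getattr_unlocked(inode):
    """Model of a ->getattr that leaves locking to its caller."""
    return inode.size


def open_path(inode, getattr_fn):
    """Model of a caller that already holds the inode lock when it
    asks for attributes."""
    with inode.lock:
        return getattr_fn(inode)
```

In this model, open_path(inode, getattr_locked) times out and returns
None because the same thread already holds the lock, while
open_path(inode, getattr_unlocked) returns the size. That matches the
expectation above that ->getattr relies on its caller for locking.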