Re: nfs client deadloop on 6.6.53


 



On Mon, 2024-10-14 at 06:10 +0800, Wang Yugui wrote:
> Hi,
> 
> > On Tue, 2024-10-08 at 21:27 +0800, Wang Yugui wrote:
> > > Hi,
> > > 
> > > nfs client deadloop on 6.6.53.
> > > 
> > > [ 9409.381322] sysrq: Show Blocked State
> > > [ 9409.386146] task:bash            state:D stack:0     pid:2323  ppid:2226   flags:0x00004002
> > > [ 9409.395225] Call Trace:
> > > [ 9409.398376]  <TASK>
> > > [ 9409.401172]  __schedule+0x232/0x5d0
> > > [ 9409.405370]  schedule+0x5e/0xd0
> > > [ 9409.409217]  schedule_timeout+0x8c/0x170
> > > [ 9409.413837]  ? __pfx_process_timeout+0x10/0x10
> > > [ 9409.418989]  msleep+0x3b/0x50
> > > [ 9409.422656]  ff_layout_pg_init_read+0x1c1/0x290 [nfs_layout_flexfiles]
> > > [ 9409.429910]  __nfs_pageio_add_request+0x29b/0x480 [nfs]
> > > [ 9409.435911]  nfs_pageio_add_request+0x221/0x2a0 [nfs]
> > > [ 9409.441715]  nfs_read_add_folio+0x1a3/0x280 [nfs]
> > > [ 9409.447175]  nfs_readahead+0x235/0x2d0 [nfs]
> > > [ 9409.452193]  read_pages+0x56/0x2c0
> > > [ 9409.456298]  page_cache_ra_unbounded+0x134/0x1a0
> > > [ 9409.461626]  filemap_get_pages+0xf5/0x3a0
> > > [ 9409.466355]  ? __nfs_lookup_revalidate+0x53/0x140 [nfs]
> > > [ 9409.472325]  filemap_read+0xdc/0x350
> > > [ 9409.476614]  ? find_idlest_group+0x113/0x530
> > > [ 9409.481614]  nfs_file_read+0x74/0xc0 [nfs]
> > > [ 9409.486461]  __kernel_read+0xff/0x2b0
> > > [ 9409.490838]  search_binary_handler+0x70/0x250
> > > [ 9409.495908]  exec_binprm+0x50/0x1a0
> > > [ 9409.500102]  bprm_execve.part.0+0x17d/0x230
> > > [ 9409.504993]  do_execveat_common.isra.0+0x1a2/0x240
> > > [ 9409.510489]  __x64_sys_execve+0x37/0x50
> > > [ 9409.515026]  do_syscall_64+0x5a/0x90
> > > [ 9409.519298]  ? __count_memcg_events+0x4c/0xa0
> > > [ 9409.524348]  ? mm_account_fault+0x6c/0x100
> > > [ 9409.529129]  ? handle_mm_fault+0x154/0x280
> > > [ 9409.533903]  ? do_user_addr_fault+0x35f/0x680
> > > [ 9409.538935]  ? exc_page_fault+0x69/0x150
> > > [ 9409.543537]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > > [ 9409.549277] RIP: 0033:0x7f57378d987b
> > > [ 9409.553572] RSP: 002b:00007ffdb5978708 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
> > > [ 9409.561847] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f57378d987b
> > > [ 9409.569690] RDX: 000055d26e403600 RSI: 000055d26e5cdc50 RDI: 000055d26e6ce7f0
> > > [ 9409.577534] RBP: 000055d26e6ce7f0 R08: 000055d26e5a5b60 R09: 0000000000000000
> > > [ 9409.585375] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
> > > [ 9409.593208] R13: 000055d26e5cdc50 R14: 000055d26e403600 R15: 000055d26e6ceb40
> > > [ 9409.601047]  </TASK>
> > > [ 9409.603946] task:bash            state:D stack:0     pid:2550  ppid:2462   flags:0x00004002
> > > [ 9409.613027] Call Trace:
> > > [ 9409.616185]  <TASK>
> > > [ 9409.618983]  __schedule+0x232/0x5d0
> > > [ 9409.623186]  schedule+0x5e/0xd0
> > > [ 9409.627033]  io_schedule+0x46/0x70
> > > [ 9409.631140]  folio_wait_bit_common+0x133/0x390
> > > [ 9409.636294]  ? folio_wait_bit_common+0x100/0x390
> > > [ 9409.641624]  ? nfs4_do_open+0xcd/0x210 [nfsv4]
> > > [ 9409.646854]  ? __pfx_wake_page_function+0x10/0x10
> > > [ 9409.652268]  filemap_update_page+0x2bc/0x300
> > > [ 9409.657242]  filemap_get_pages+0x21d/0x3a0
> > > [ 9409.662042]  ? __nfs_lookup_revalidate+0x53/0x140 [nfs]
> > > [ 9409.668010]  filemap_read+0xdc/0x350
> > > [ 9409.672299]  nfs_file_read+0x74/0xc0 [nfs]
> > > [ 9409.677126]  __kernel_read+0xff/0x2b0
> > > [ 9409.681476]  search_binary_handler+0x70/0x250
> > > [ 9409.686526]  exec_binprm+0x50/0x1a0
> > > [ 9409.690702]  bprm_execve.part.0+0x17d/0x230
> > > [ 9409.695573]  do_execveat_common.isra.0+0x1a2/0x240
> > > [ 9409.701047]  __x64_sys_execve+0x37/0x50
> > > [ 9409.705559]  do_syscall_64+0x5a/0x90
> > > [ 9409.709805]  ? do_user_addr_fault+0x35f/0x680
> > > [ 9409.714834]  ? exc_page_fault+0x69/0x150
> > > [ 9409.719414]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > > [ 9409.725126] RIP: 0033:0x7f3c492d987b
> > > [ 9409.729362] RSP: 002b:00007ffc6413a458 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
> > > [ 9409.737609] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f3c492d987b
> > > [ 9409.745429] RDX: 000055c6a8f07600 RSI: 000055c6a90e72a0 RDI: 000055c6a90f7890
> > > [ 9409.753256] RBP: 000055c6a90f7890 R08: 000055c6a90f6250 R09: 0000000000000000
> > > [ 9409.761078] R10: 0000000000000008 R11: 0000000000000246 R12: 00000000ffffffff
> > > [ 9409.768904] R13: 000055c6a90e72a0 R14: 000055c6a8f07600 R15: 000055c6a90e1ea0
> > > [ 9409.776732]  </TASK>
> > > 
> > > Notice:
> > > 1, nfs server: kernel 6.6.54
> > > The pnfs option is set on the server side in /etc/exports.
> > > 
> > 
> > This is not a client bug.
> > 
> > The client has no choice other than to retry here. It is being given a
> > layout that it cannot use (probably because it has already discovered
> > that it cannot talk to the data server), but it is also being told by
> > the same layout that it is not allowed to fall back to doing I/O
> > through the metadata server.
> > 
> > IOW: This bug needs to be fixed on the server, which is handing out a
> > layout that is impossible to satisfy.
> 
> It seems that pnfs needs nfs3/udp, but nfs3/udp is disabled on this
> server.

That is incorrect. There should be no need to enable RPC over UDP.
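
To make the retry behaviour described in the quoted explanation concrete,
here is a rough userspace model of the decision the client faces. It is
only a sketch, not the kernel code: the names Layout, Outcome and
init_read are hypothetical, and the max_attempts bound exists purely so
the demonstration terminates (the situation in the trace, sleeping in
msleep under ff_layout_pg_init_read, has no such bound).

```python
from dataclasses import dataclass
from enum import Enum
import time


class Outcome(Enum):
    USE_DATA_SERVER = "data server"
    USE_MDS = "metadata server"
    STILL_RETRYING = "still retrying"


@dataclass
class Layout:
    # Hypothetical stand-ins for what the client learns from the layout.
    ds_reachable: bool        # can the client talk to the data server?
    no_fallback_to_mds: bool  # does the layout forbid I/O through the MDS?


def init_read(get_layout, max_attempts=3, delay=0.0):
    """Model of the read-path decision; get_layout() asks the server again."""
    for _ in range(max_attempts):
        layout = get_layout()
        if layout.ds_reachable:
            return Outcome.USE_DATA_SERVER
        if not layout.no_fallback_to_mds:
            return Outcome.USE_MDS
        # Unusable layout and no permission to fall back: sleep and retry.
        time.sleep(delay)
    # Assumption for the demo only: give up so the example ends.
    return Outcome.STILL_RETRYING
```

With a server that keeps handing out an unreachable layout marked as
no-fallback, init_read never reaches either I/O path, which is why the
fix belongs on the server side.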

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





