On 2/25/25 3:51 AM, Li Lingfeng wrote:
> Hi.
> Recently, during fault injection testing, we found an issue where nfsd
> process cannot exit when /proc/fs/nfsd/threads is written to 0, causing
> other processes to be unable to acquire nfsd_mutex, leading to a hungtask.
> This is the stack trace of the nfsd process:
> PID: 107326 TASK: ffff8881013a4040 CPU: 1 COMMAND: "nfsd"
>  #0 [ffffc900077077d8] __schedule at ffffffff9c6434b6
>  #1 [ffffc900077078d8] schedule at ffffffff9c643e28
>  #2 [ffffc90007707900] schedule_timeout at ffffffff9c64bf16
>  #3 [ffffc90007707a68] wait_for_common at ffffffff9c645346
>  #4 [ffffc90007707b38] nfsd4_cld_create at ffffffff9b80626a
>  #5 [ffffc90007707c40] nfsd4_open_confirm at ffffffff9b7f41d9
>  #6 [ffffc90007707ce0] nfsd4_proc_compound at ffffffff9b7c872a
>  #7 [ffffc90007707d80] nfsd_dispatch at ffffffff9b79f20d
>  #8 [ffffc90007707dc8] svc_process_common at ffffffff9c4ad9fb
>  #9 [ffffc90007707ea0] svc_process at ffffffff9c4adf15
> #10 [ffffc90007707ed8] nfsd at ffffffff9b79ba18
> #11 [ffffc90007707f10] kthread at ffffffff9af908c4
> #12 [ffffc90007707f50] ret_from_fork at ffffffff9ae048df
>
> This is because the nfsdcld process exited abnormally, causing the nfsd
> process to wait indefinitely for a downcall response after initiating an
> upcall.
> Here is the log of nfsdcld:
> Jan 4 02:22:29 localhost nfsdcld[696]: cld_message_size invalid upcall
> version 0
> Jan 4 02:22:29 localhost systemd[1]: nfsdcld.service: Main process
> exited, code=exited, status=1/FAILURE
> Jan 4 02:22:29 localhost systemd[1]: nfsdcld.service: Failed with
> result 'exit-code'.
>
> Memory fault injection caused the kernel to report cld_msg in v1 format,
> and nfsdcld parsed it incorrectly, leading to an abnormal exit.

Without commenting on the timeout question, IMO this failure mode is
problematic as well...

> // Expected Scenario
> nfsd4_client_tracking_init
> nn->client_tracking_ops = &nfsd4_cld_tracking_ops; // Initialize to v1
> nfsd4_cld_tracking_init
> nfsd4_cld_get_version
> cld_pipe_upcall // Request version information from user space
> nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v2; // Initialize to v2
>
> // Actual Scenario
> nfsd4_client_tracking_init
> nn->client_tracking_ops = &nfsd4_cld_tracking_ops; // Initialize to v1
> nfsd4_cld_tracking_init
> nfsd4_cld_get_version
> alloc_cld_upcall // A failure is returned due to memory fault
>                  // injection, and the upcall is skipped.
> nfsd4_cld_grace_start
> alloc_cld_upcall // A failure is returned due to memory fault
>                  // injection, and the upcall is skipped.
> nn->client_tracking_ops = &nfsd4_cld_tracking_ops_v0 // Initialize to v1
>
> *I was wondering if the kernel might benefit from having a timeout mechanism
> in place to gracefully handle situations where nfsdcld is unable to send a
> downcall for certain reasons, ensuring that the nfsd process can exit
> properly.*
>
>
> Link: https://lore.kernel.org/all/3e26c767-f347-4dbe-ae04-aabe8e87af12@xxxxxxxxxx/
>

-- 
Chuck Lever