On 12/10/24 18:16, etmartin4313@xxxxxxxxx wrote:
> From: Etienne Martineau <etmartin4313@xxxxxxxxx>
>
> This patch abort connection if HUNG_TASK_PANIC is set and a FUSE server
> is getting stuck for too long.
>
> Without this patch, an unresponsive / buggy / malicious FUSE server can
> leave the clients in D state for a long period of time and on system where
> HUNG_TASK_PANIC is set, trigger a catastrophic reload.
>
> So, if HUNG_TASK_PANIC checking is enabled, we should wake up periodically
> to abort connections that exceed the timeout value which is define to be
> half the HUNG_TASK_TIMEOUT period, which keeps overhead low.
>
> This patch introduce a list of request waiting for answer that is time
> sorted to minimize the overhead.
>
> When HUNG_TASK_PANIC is enable there is a timeout check per connection
> that is running at low frequency only if there are active FUSE request
> pending.
>
> A FUSE client can get into D state as such ( see below Scenario #1 / #2 )
> 1) request_wait_answer() -> wait_event() is UNINTERRUPTIBLE
> OR
> 2) request_wait_answer() -> wait_event_(interruptible / killable) is head
> of line blocking for subsequent clients accessing the same file

I don't think that will help you for fuse background requests.

[422820.431981] INFO: task dd:1590644 blocked for more than 120 seconds.
[422820.436556] Not tainted 6.13.0-rc1+ #92
[422820.439189] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[422820.446822] task:dd state:D stack:27440 pid:1590644 tgid:1590644 ppid:1590478 flags:0x00000002
[422820.456782] Call Trace:
[422820.459467] <TASK>
[422820.461667] __schedule+0x1b42/0x25b0
[422820.465312] schedule+0xb5/0x260
[422820.468568] schedule_preempt_disabled+0x19/0x30
[422820.473033] rwsem_down_write_slowpath+0x8a6/0x12b0
[422820.477644] ? generic_file_write_iter+0x82/0x240
[422820.481774] down_write+0x16f/0x1a0
[422820.486756] generic_file_write_iter+0x82/0x240
[422820.490412] ? fuse_file_read_iter+0x490/0x490 [fuse]
[422820.493021] vfs_write+0x7c8/0xb70
[422820.494389] ? fuse_file_read_iter+0x490/0x490 [fuse]
[422820.497003] ksys_write+0xce/0x170
[422820.500110] do_syscall_64+0x81/0x120
[422820.502941] ? irqentry_exit_to_user_mode+0x133/0x180
[422820.505504] entry_SYSCALL_64_after_hwframe+0x4b/0x53

Joanne's timeout patches are more generic and handle these as well.

Thanks,
Bernd