Re: [PATCH] fuse: Abort connection if FUSE server get stuck

On 12/10/24 18:16, etmartin4313@xxxxxxxxx wrote:
> From: Etienne Martineau <etmartin4313@xxxxxxxxx>
> 
> This patch aborts the connection if HUNG_TASK_PANIC is set and a FUSE server
> is stuck for too long.
> 
> Without this patch, an unresponsive / buggy / malicious FUSE server can
> leave the clients in D state for a long period of time and, on systems where
> HUNG_TASK_PANIC is set, trigger a catastrophic reload.
> 
> So, if HUNG_TASK_PANIC checking is enabled, we should wake up periodically
> to abort connections that exceed the timeout value, which is defined to be
> half the HUNG_TASK_TIMEOUT period, which keeps overhead low.
> 
> This patch introduces a time-sorted list of requests waiting for an answer,
> which minimizes the overhead.
> 
> When HUNG_TASK_PANIC is enabled there is a per-connection timeout check
> that runs at low frequency, and only if there are active FUSE requests
> pending.
> 
> A FUSE client can get into D state as such ( see below Scenario #1 / #2 )
>  1) request_wait_answer() -> wait_event() is UNINTERRUPTIBLE
>     OR
>  2) request_wait_answer() -> wait_event_(interruptible / killable) is head
>     of line blocking for subsequent clients accessing the same file
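
For reference, the scheme described in the quoted text (a time-sorted list
of pending requests, checked periodically against half the hung-task
timeout) could be sketched in userspace C roughly as below. This is not the
code from the patch: struct pending_req, struct conn and the conn_*()
helpers are hypothetical names, time is a plain seconds counter, and there
is no locking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; these names do not come from the patch. */
struct pending_req {
	unsigned long queued_at;          /* enqueue time, in seconds */
	struct pending_req *prev, *next;
};

struct conn {
	struct pending_req *head;         /* oldest pending request */
	struct pending_req *tail;         /* newest pending request */
	bool aborted;
};

/*
 * Append at the tail. Assuming "now" never goes backwards, the list
 * stays sorted by enqueue time without any explicit sorting.
 */
static void conn_queue(struct conn *c, struct pending_req *r,
		       unsigned long now)
{
	r->queued_at = now;
	r->next = NULL;
	r->prev = c->tail;
	if (c->tail)
		c->tail->next = r;
	else
		c->head = r;
	c->tail = r;
}

/* Unlink a request once it has been answered (any position, O(1)). */
static void conn_complete(struct conn *c, struct pending_req *r)
{
	if (r->prev)
		r->prev->next = r->next;
	else
		c->head = r->next;
	if (r->next)
		r->next->prev = r->prev;
	else
		c->tail = r->prev;
	r->prev = r->next = NULL;
}

/*
 * Periodic check. Only the head is inspected, because it is the oldest
 * request. The limit is half the hung-task timeout, as described in the
 * quoted text. Returns true if the connection is (now) aborted.
 */
static bool conn_check_timeout(struct conn *c, unsigned long now,
			       unsigned long hung_task_timeout)
{
	unsigned long limit = hung_task_timeout / 2;

	if (c->head && now - c->head->queued_at >= limit)
		c->aborted = true;
	return c->aborted;
}
```

With a 120 second hung-task timeout, a request queued at t=100 would trip
the check from t=160 onwards. Note that this only covers requests sitting
on the list; a task blocked elsewhere is not seen by such a check.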


I don't think that will help you with FUSE background requests, for example:

[422820.431981] INFO: task dd:1590644 blocked for more than 120 seconds.
[422820.436556]       Not tainted 6.13.0-rc1+ #92
[422820.439189] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[422820.446822] task:dd              state:D stack:27440 pid:1590644 tgid:1590644 ppid:1590478 flags:0x00000002
[422820.456782] Call Trace:
[422820.459467]  <TASK>
[422820.461667]  __schedule+0x1b42/0x25b0
[422820.465312]  schedule+0xb5/0x260
[422820.468568]  schedule_preempt_disabled+0x19/0x30
[422820.473033]  rwsem_down_write_slowpath+0x8a6/0x12b0
[422820.477644]  ? generic_file_write_iter+0x82/0x240
[422820.481774]  down_write+0x16f/0x1a0
[422820.486756]  generic_file_write_iter+0x82/0x240
[422820.490412]  ? fuse_file_read_iter+0x490/0x490 [fuse]
[422820.493021]  vfs_write+0x7c8/0xb70
[422820.494389]  ? fuse_file_read_iter+0x490/0x490 [fuse]
[422820.497003]  ksys_write+0xce/0x170
[422820.500110]  do_syscall_64+0x81/0x120
[422820.502941]  ? irqentry_exit_to_user_mode+0x133/0x180
[422820.505504]  entry_SYSCALL_64_after_hwframe+0x4b/0x53


Joanne's timeout patches are more generic and handle these as well.


Thanks,
Bernd



