On 12/6/24 1:09 AM, Etienne wrote:
> On Wed, Dec 4, 2024 at 8:51 PM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>>
>> On 12/5/24 12:43 AM, etmartin4313@xxxxxxxxx wrote:
>>> From: Etienne Martineau <etmartin4313@xxxxxxxxx>
>>>
>>> If hung task checking is enabled and FUSE server stops responding for a
>>> long period of time, the hung task timer may fire towards the FUSE clients
>>> and trigger stack dumps that unnecessarily alarm the user.
>>
>> Isn't that expected that users shall be notified that there's something
>> wrong with the FUSE service (because of either buggy implementation or
>> malicious purpose)? Or is it expected that the normal latency of
>> handling a FUSE request is more than 30 seconds?
>
> In one way you're right because seeing those stack dumps tells you
> right away that something is wrong with a FUSE service.
> Having said that, with many FUSE services running, those stack dumps
> are not helpful at pointing out which of the FUSE services is having
> issues.
>
> Maybe we should instead have proper debug in place to dump the FUSE
> connection so that user can abort via
> /sys/fs/fuse/connections/'nn'/abort
> Something like "pr_warn("Fuse connection %u not responding\n", fc->dev);" maybe?

If the goal is to identify which FUSE connection is problematic, then
yes, that is not easy to do, as the hung task detector has no concept of
the underlying filesystem. Nor is that something the hung task mechanism
needs to do.

To do that, you would at least need to record a per-request timestamp
when the request is submitted, or have a complete timeout mechanism in
FUSE as pointed out by Joanne [1].

[1] https://lore.kernel.org/linux-fsdevel/20241114191332.669127-1-joannelkoong@xxxxxxxxx/

>
> Also, now that you are pointing out a malicious implementation, I
> realized that on a system with 'hung_task_panic' set, a non-privileged
> user can easily trip the hung task timer and force a panic.
>
> I just tried the following sequence using FUSE sshfs and without this
> patch my system went down.
>
> sudo bash -c 'echo 30 > /proc/sys/kernel/hung_task_timeout_secs'
> sudo bash -c 'echo 1 > /proc/sys/kernel/hung_task_panic'
> sshfs -o allow_other,default_permissions you@localhost:/home/you/test ./mnt
> kill -STOP `pidof /usr/lib/openssh/sftp-server`
> ls ./mnt/
> ^C

IMHO hung_task_panic should not be enabled in a production environment.

--
Thanks,
Jingbo
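
P.S. To make the per-request timestamp idea above concrete, here is a
rough userspace model in Python. It is only a sketch, not kernel code
and not what the FUSE driver does today: the Connection class, its
methods, and stalled() are all hypothetical names made up for
illustration, and the "dev" field merely mirrors the fc->dev used in
the pr_warn suggestion quoted above.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical userspace stand-in for one FUSE connection."""
    dev: int  # connection id, mirroring fc->dev in the quoted pr_warn
    pending: dict = field(default_factory=dict)  # request id -> submit time

    def submit(self, req_id, now=None):
        # Record the timestamp at the moment the request is queued.
        self.pending[req_id] = time.monotonic() if now is None else now

    def complete(self, req_id):
        # The server replied, so the request is no longer pending.
        self.pending.pop(req_id, None)

    def oldest_age(self, now):
        # Age in seconds of the longest-waiting request, 0.0 when idle.
        if not self.pending:
            return 0.0
        return now - min(self.pending.values())


def stalled(connections, timeout, now=None):
    """Return the ids of connections whose oldest request exceeds timeout."""
    now = time.monotonic() if now is None else now
    return [c.dev for c in connections if c.oldest_age(now) > timeout]
```

With something along these lines, a warning could name the connection
id, which is what the user needs in order to abort it via
/sys/fs/fuse/connections/'nn'/abort as mentioned in the quoted text.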