On Thu, Dec 5, 2024 at 9:10 AM Etienne <etmartin4313@xxxxxxxxx> wrote:
>
> On Wed, Dec 4, 2024 at 8:51 PM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> >
> >
> >
> > On 12/5/24 12:43 AM, etmartin4313@xxxxxxxxx wrote:
> > > From: Etienne Martineau <etmartin4313@xxxxxxxxx>
> > >
> > > If hung task checking is enabled and FUSE server stops responding for a
> > > long period of time, the hung task timer may fire towards the FUSE clients
> > > and trigger stack dumps that unnecessarily alarm the user.
> >
> > Isn't that expected that users shall be notified that there's something
> > wrong with the FUSE service (because of either buggy implementation or
> > malicious purpose)? Or is it expected that the normal latency of
> > handling a FUSE request is more than 30 seconds?
>
> In one way you're right because seeing those stack dumps tells you
> right away that something is wrong with a FUSE service.
> Having said that, with many FUSE services running, those stack dumps
> are not helpful at pointing out which of the FUSE services is having
> issues.
>
> Maybe we should instead have proper debug in place to dump the FUSE
> connection so that user can abort via
> /sys/fs/fuse/connections/'nn'/abort
> Something like "pr_warn("Fuse connection %u not responding\n", fc->dev);" maybe?

Having some identifying information about which connection is
unresponsive seems useful, but I don't see a straightforward way of
implementing this without adding additional per-request overhead.

>
> Also, now that you are pointing out a malicious implementation, I
> realized that on a system with 'hung_task_panic' set, a non-privileged
> user can easily trip the hung task timer and force a panic.
>
> I just tried the following sequence using FUSE sshfs and without this
> patch my system went down.
>
> sudo bash -c 'echo 30 > /proc/sys/kernel/hung_task_timeout_secs'
> sudo bash -c 'echo 1 > /proc/sys/kernel/hung_task_panic'
> sshfs -o allow_other,default_permissions you@localhost:/home/you/test ./mnt
> kill -STOP `pidof /usr/lib/openssh/sftp-server`
> ls ./mnt/
> ^C

I'm not sure if this addresses your particular use case, but there's a
patch upstream that adds request timeouts
https://lore.kernel.org/linux-fsdevel/20241114191332.669127-1-joannelkoong@xxxxxxxxx/

This can be set globally via sysctls (eg
"/proc/sys/fs/fuse/max_request_timeout") or on a per-server basis. If
the timeout elapses and the request has not been fulfilled (eg
malicious or buggy fuse server), the kernel will abort the connection
automatically.

Thanks,
Joanne

>
> thanks,
> Etienne
>