Re: [PATCH v2] fuse: Abort connection if FUSE server get stuck

Joanne Koong <joannelkoong@xxxxxxxxx> · Thu, 12 Dec 2024 15:30:19 -0800

On Thu, Dec 12, 2024 at 2:46 PM Etienne Martineau
<etmartin4313@xxxxxxxxx> wrote:
>
> On Thu, Dec 12, 2024 at 4:48 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > On Wed, Dec 11, 2024 at 3:04 PM Etienne Martineau
> > <etmartin4313@xxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 11, 2024 at 5:04 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, Dec 11, 2024 at 11:45 AM <etmartin4313@xxxxxxxxx> wrote:
> > > > >
> > > > > From: Etienne Martineau <etmartin4313@xxxxxxxxx>
> > > > >
> > > > > This patch abort connection if HUNG_TASK_PANIC is set and a FUSE server
> > > > > is getting stuck for too long. A slow FUSE server may tripped the
> > > > > hang check timer for legitimate reasons hence consider disabling
> > > > > HUNG_TASK_PANIC in that scenario.
> > > > >
> > > > > Without this patch, an unresponsive / buggy / malicious FUSE server can
> > > > > leave the clients in D state for a long period of time and on system where
> > > > > HUNG_TASK_PANIC is set, trigger a catastrophic reload.
> > > > >
> > > > > So, if HUNG_TASK_PANIC checking is enabled, we should wake up periodically
> > > > > to abort connections that exceed the timeout value which is define to be
> > > > > half the HUNG_TASK_TIMEOUT period, which keeps overhead low. The timer
> > > > > is per connection and runs only if there are active FUSE request pending.
> > > >
> > > > Hi Etienne,
> > > >
> > > > For your use case, does the generic request timeouts logic and
> > > > max_request_timeout systemctl implemented in [1] and [2] not suffice?
> > > > IMO I don't think we should have logic specifically checking for hung
> > > > task timeouts in fuse, if the generic solution can be used.
> > > >
> > > > Thanks,
> > > > Joanne
> > >
> > > We need a way to avoid catastrophic reloads on systems where HUNG_TASK_PANIC
> > > is set while a buggy / malicious FUSE server stops responding.
> > > I would argue that this is much needed in stable branches as well...
> > >
> > > For that reason, I believe we need to keep things simple for step #1
> > > e.g. there is no
> > > need to introduce another knob as we already have HUNG_TASK_TIMEOUT which
> > > holds the source of truth.
> > >
> > > IMO introducing those new knobs will put an unnecessary burden on sysadmins into
> > > something that is error prone because unlike
> > >   CONFIG_DETECT_HUNG_TASK=y
> > >   CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > which is built-in, the "default_request_timeout" /
> > > "max_request_timeout" needs to be
> > > set appropriately after every reboot and failure to do so may have
> > > nasty consequences.
> >
> > imo, it is not important to directly defend against the hung task case
> > inside the fuse code itself. imo, the generic timeout should be used
> > instead. As I understand it, hung task panic is mostly enabled for
> > debug purposes and is enabled through a sysctl. imo, if the system
> > admin enables the hung task panic sysctl value, then it is not too
> > much of a burden for them to also set the fuse max request timeout
> > sysctl.
> >
> >
> > Thanks,
> > Joanne
> >
> Yes, based on the comments received so far I agree that generic timeout is the
> prefered approach. Looks like we are amongst the few that run production
> systems with hung task panic set. So yeah, I will match fuse max request
> timeout with hung task timeout to get the equivalent behavior.

Sounds great. Just FYI, the timeouts in fuse won't be 100% precise -
they'll have an upper margin of error associated with it (this is
included in the documentation for the sysctl, since it's very
non-obvious). For example, if the max request timeout is set to 600
seconds, it may fire off a little after 600 seconds. So it'd be best
if you set the fuse max request timeout to be below the hung task
timeout to be sure. IIRC, Sergey is doing the same thing [1].

[1] https://lore.kernel.org/linux-fsdevel/20241128115455.GG10431@xxxxxxxxxx/

>
> On a slightly different matter, I realized that in some scenarios
> there is no benefit
> in stopping the timer when reaching the last request because another
> request can come
> right after and then we have to start the timer once again which keeps bouncing
> between cancel_delayed_work_sync() and queue_delayed_work().
>
> So I think it's best to stick with your approach of starting the timer
> when the connection
> is initially established. I can re-work this patch if needed?

Thanks for testing out the timeout functionality. I'm planning to
submit v10 of the generic timeout patch to use workqueues early next
week. The time granularity will also be changed to work in seconds
instead of minutes, as noted for Sergey's and your use case. I'll make
sure you get cc'ed on that patchset.

>
> I've been doing some testing and so far I hit timeout in bg_queue and
> fpq->processing
> but I cannot trigger timeouts in fiq->pending somehow?

You can trigger the fiq->pending timeout by having your fuse server
never read from /dev/fuse, which will keep the request on the
fiq->pending list when the timeout hits. The request is only taken off
the fiq->pending list when fuse reads a request into the server's
buffer (see fuse_dev_do_read()).

Thanks,
Joanne
>
> thanks
> Etienne