Re: [PATCH RESEND v9 2/3] fuse: add optional kernel-enforced timeout for requests

Joanne Koong <joannelkoong@xxxxxxxxx> · Fri, 6 Dec 2024 11:05:43 -0800

On Fri, Dec 6, 2024 at 8:31 AM Etienne Martineau <etmartin4313@xxxxxxxxx> wrote:
>
> From: Joanne Koong <joannelkoong@xxxxxxxxx>
>
> > There are situations where fuse servers can become unresponsive or
> > stuck, for example if the server is deadlocked. Currently, there's no
> > good way to detect if a server is stuck and needs to be killed manually.
>
> I tried this V9 patchset and realized that upon max_request_timeout the connection will be dropped irrespectively if the task is in D state or not. I guess this is expected behavior.

Yes, the connection will be dropped regardless of what state the task is in.

> To me the concerning aspect is when tasks are going in D state because of the consequence when running with hung_task_timeout_secs and hung_task_panic=1.

Could you elaborate on this a bit more? When running with
hung_task_timeout_secs and hung_task_panic=1, how does it cause the
task to go into D state?

>
> >
> > This commit adds an option for enforcing a timeout (in minutes) for
> > requests where if the timeout elapses without the server responding to
> > the request, the connection will be automatically aborted.
> >
> > Please note that these timeouts are not 100% precise. The request may
> > take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max
> > timeout due to how it's internally implemented.
> >
> > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> > Reviewed-by: Bernd Schubert <bschubert@xxxxxxx>
> > ---
> >  fs/fuse/dev.c    | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  fs/fuse/fuse_i.h | 21 +++++++++++++
> >  fs/fuse/inode.c  | 21 +++++++++++++
> >  3 files changed, 122 insertions(+)
> >
> > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > index 29fc61a072ba..536aa4525e8f 100644
> > --- a/fs/fuse/dev.c
> > +++ b/fs/fuse/dev.c
> > @@ -45,6 +45,82 @@ static struct fuse_dev *fuse_get_dev(struct file *file)
> >       return READ_ONCE(file->private_data);
> >  }
> >
> > +static bool request_expired(struct fuse_conn *fc, struct fuse_req *req)
> > +{
> > +     return jiffies > req->create_time + fc->timeout.req_timeout;
> > +}
> > +
> > +/*
> > + * Check if any requests aren't being completed by the specified request
> > + * timeout. To do so, we:
> > + * - check the fiq pending list
> > + * - check the bg queue
> > + * - check the fpq io and processing lists
> > + *
> > + * To make this fast, we only check against the head request on each list since
> > + * these are generally queued in order of creation time (eg newer requests get
> > + * queued to the tail). We might miss a few edge cases (eg requests transitioning
> > + * between lists, re-sent requests at the head of the pending list having a
> > + * later creation time than other requests on that list, etc.) but that is fine
> > + * since if the request never gets fulfilled, it will eventually be caught.
> > + */
> > +void fuse_check_timeout(struct timer_list *timer)
> > +{
> > +     struct fuse_conn *fc = container_of(timer, struct fuse_conn, timeout.timer);
> Here this timer may get queued and if echo 1 > /sys/fs/fuse/connections/'nn'/abort is done at more or less the same time over the same connection I'm wondering what will happen?
> At least I think we may need timer_delete_sync() instead of timer_delete() in fuse_abort_conn() and potentially call it from the top of fuse_abort_conn() instead.

I don't think this is an issue because there's still a reference on
the "struct fuse_conn" when fuse_abort_conn() is called. The fuse_conn
struct is freed in fuse_conn_put() when the last refcount is dropped,
and in fuse_conn_put() there's this line

if (fc->timeout.req_timeout)
  timer_shutdown_sync(&fc->timeout.timer);

that guarantees the timer is not queued / callback of timer is not
running / cannot be rearmed.

>
> > +     struct fuse_iqueue *fiq = &fc->iq;
> > +     struct fuse_req *req;
> > +     struct fuse_dev *fud;
> > +     struct fuse_pqueue *fpq;
> > +     bool expired = false;
> > +     int i;
> > +
> > +     spin_lock(&fiq->lock);
> Do we need spin_lock_irq instead bcos this is irq context no?

Yeah, I missed that spin_lock doesn't disable interrupts. Sergey noted
this as well in [1] and for the next iteration of this patchset, I'm
planning to use use a kthread per server connection, similar to what
the hung task watchdog does.

[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1bwYjHGGNsPwjsd5Kp3iL6ua_hmyN3kFZo1b8OVV9bOpw@xxxxxxxxxxxxxx/

>
> >
> > +static void fuse_init_fc_timeout(struct fuse_conn *fc, struct fuse_fs_context *ctx)
> > +{
> > +     if (ctx->req_timeout) {
> > +             if (check_mul_overflow(ctx->req_timeout * 60, HZ, &fc->timeout.req_timeout))
> > +                     fc->timeout.req_timeout = ULONG_MAX;
> > +             timer_setup(&fc->timeout.timer, fuse_check_timeout, 0);
> > +             mod_timer(&fc->timeout.timer, jiffies + FUSE_TIMEOUT_TIMER_FREQ);
> > +     } else {
> > +             fc->timeout.req_timeout = 0;
> > +     }
> > +}
> > +
> >  int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> >  {
> >       struct fuse_dev *fud = NULL;
> > @@ -1753,6 +1773,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> >       fc->destroy = ctx->destroy;
> >       fc->no_control = ctx->no_control;
> >       fc->no_force_umount = ctx->no_force_umount;
> > +     fuse_init_fc_timeout(fc, ctx);
>
> The max_request_timeout is latched in fc->timeout.req_timeout during fuse_fill_super_common() so any further modification are not taking into effect
> Say:
>  sudo bash -c 'echo 100 > /proc/sys/fs/fuse/max_request_timeout'
>  sshfs -o allow_other,default_permissions you@localhost:/home/you/test ./mnt
>  kill -STOP `pidof /usr/lib/openssh/sftp-server`
>  ls ./mnt/
>  ^C
>   # Trying to change back the timeout doesn't have any effect
>  sudo bash -c 'echo 10 > /proc/sys/fs/fuse/max_request_timeout'
>

This is expected behavior. The timeout that's set at mount time will
be the timeout for the duration of the connection.

Thanks,
Joanne

> Thanks
>