Re: [PATCH v2 1/2] fuse: add optional kernel-enforced timeout for requests

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Tue, 6 Aug 2024 17:50:48 +0200

On 8/6/24 00:53, Joanne Koong wrote:
> On Mon, Aug 5, 2024 at 12:32 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>>
>>
>>
>> On 7/30/24 8:23 AM, Joanne Koong wrote:
>>> There are situations where fuse servers can become unresponsive or take
>>> too long to reply to a request. Currently there is no upper bound on
>>> how long a request may take, which may be frustrating to users who get
>>> stuck waiting for a request to complete.
>>>
>>> This commit adds a timeout option (in seconds) for requests. If the
>>> timeout elapses before the server replies to the request, the request
>>> will fail with -ETIME.
>>>
>>> There are 3 possibilities for a request that times out:
>>> a) The request times out before the request has been sent to userspace
>>> b) The request times out after the request has been sent to userspace
>>> and before it receives a reply from the server
>>> c) The request times out after the request has been sent to userspace
>>> and the server replies while the kernel is timing out the request
>>>
>>> While a request timeout is being handled, there may be other handlers
>>> running at the same time if:
>>> a) the kernel is forwarding the request to the server
>>> b) the kernel is processing the server's reply to the request
>>> c) the request is being re-sent
>>> d) the connection is aborting
>>> e) the device is getting released
>>>
>>> Proper synchronization must be added to ensure that the request is
>>> handled correctly in all of these cases. To this effect, there is a new
>>> FR_FINISHING bit added to the request flags, which is set atomically by
>>> either the timeout handler (see fuse_request_timeout()) which is invoked
>>> after the request timeout elapses or set by the request reply handler
>>> (see dev_do_write()), whichever gets there first. If the reply handler
>>> and the timeout handler are executing simultaneously and the reply handler
>>> sets FR_FINISHING before the timeout handler, then the request will be
>>> handled as if the timeout did not elapse. If the timeout handler sets
>>> FR_FINISHING before the reply handler, then the request will fail with
>>> -ETIME and the request will be cleaned up.
>>>
>>> Currently, this is the refcount lifecycle of a request:
>>>
>>> Synchronous request is created:
>>> fuse_simple_request -> allocates request, sets refcount to 1
>>>   __fuse_request_send -> acquires refcount
>>>     queues request and waits for reply...
>>> fuse_simple_request -> drops refcount
>>>
>>> Background request is created:
>>> fuse_simple_background -> allocates request, sets refcount to 1
>>>
>>> Request is replied to:
>>> fuse_dev_do_write
>>>   fuse_request_end -> drops refcount on request
>>>
>>> Proper acquires on the request reference must be added to ensure that the
>>> timeout handler does not drop the last refcount on the request while
>>> other handlers may be operating on the request. Please note that the
>>> timeout handler may get invoked at any phase of the request's
>>> lifetime (eg before the request has been forwarded to userspace, etc).
>>>
>>> It is always guaranteed that there is a refcount on the request when the
>>> timeout handler is executing. The timeout handler will be either
>>> deactivated by the reply/abort/release handlers, or if the timeout
>>> handler is concurrently executing on another CPU, the reply/abort/release
>>> handlers will wait for the timeout handler to finish executing first before
>>> it drops the final refcount on the request.
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
>>> ---
>>>  fs/fuse/dev.c    | 187 +++++++++++++++++++++++++++++++++++++++++++++--
>>>  fs/fuse/fuse_i.h |  14 ++++
>>>  fs/fuse/inode.c  |   7 ++
>>>  3 files changed, 200 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
>>> index 9eb191b5c4de..9992bc5f4469 100644
>>> --- a/fs/fuse/dev.c
>>> +++ b/fs/fuse/dev.c
>>> @@ -31,6 +31,8 @@ MODULE_ALIAS("devname:fuse");
>>>
>>>  static struct kmem_cache *fuse_req_cachep;
>>>
>>> +static void fuse_request_timeout(struct timer_list *timer);
>>> +
>>>  static struct fuse_dev *fuse_get_dev(struct file *file)
>>>  {
>>>       /*
>>> @@ -48,6 +50,8 @@ static void fuse_request_init(struct fuse_mount *fm, struct fuse_req *req)
>>>       refcount_set(&req->count, 1);
>>>       __set_bit(FR_PENDING, &req->flags);
>>>       req->fm = fm;
>>> +     if (fm->fc->req_timeout)
>>> +             timer_setup(&req->timer, fuse_request_timeout, 0);
>>>  }
>>>
>>>  static struct fuse_req *fuse_request_alloc(struct fuse_mount *fm, gfp_t flags)
>>> @@ -277,12 +281,15 @@ static void flush_bg_queue(struct fuse_conn *fc)
>>>   * the 'end' callback is called if given, else the reference to the
>>>   * request is released
>>>   */
>>> -void fuse_request_end(struct fuse_req *req)
>>> +static void do_fuse_request_end(struct fuse_req *req, bool from_timer_callback)
>>>  {
>>>       struct fuse_mount *fm = req->fm;
>>>       struct fuse_conn *fc = fm->fc;
>>>       struct fuse_iqueue *fiq = &fc->iq;
>>>
>>> +     if (from_timer_callback)
>>> +             req->out.h.error = -ETIME;
>>> +
>>
>> FMHO, could we move the above error assignment up to the caller to make
>> do_fuse_request_end() look cleaner?
> 
> Sure, I was thinking that it looks cleaner setting this in
> do_fuse_request_end() instead of having to set it in both
> timeout_pending_req() and timeout_inflight_req(), but I see your point
> as well.
> I'll make this change in v3.
> 
>>
>>
>>>       if (test_and_set_bit(FR_FINISHED, &req->flags))
>>>               goto put_request;
>>>
>>> @@ -296,8 +303,6 @@ void fuse_request_end(struct fuse_req *req)
>>>               list_del_init(&req->intr_entry);
>>>               spin_unlock(&fiq->lock);
>>>       }
>>> -     WARN_ON(test_bit(FR_PENDING, &req->flags));
>>> -     WARN_ON(test_bit(FR_SENT, &req->flags));
>>>       if (test_bit(FR_BACKGROUND, &req->flags)) {
>>>               spin_lock(&fc->bg_lock);
>>>               clear_bit(FR_BACKGROUND, &req->flags);
>>> @@ -324,13 +329,105 @@ void fuse_request_end(struct fuse_req *req)
>>>               wake_up(&req->waitq);
>>>       }
>>>
>>> +     if (!from_timer_callback && req->timer.function)
>>> +             timer_delete_sync(&req->timer);
>>> +
>>
>> Similarly, move the caller i.e. fuse_request_end() call
>> timer_delete_sync() instead?
> 
> I don't think we can do that because the fuse_put_request() at the end
> of this function often holds the last refcount on the request which
> frees the request when it releases the ref.

Just a suggestion, maybe add an extra reference for the timer? The
condition above and fuse_request_timeout() would then need to release
that ref.

Thanks,
Bernd