Re: [PATCH v2 0/2] fuse: add timeout option for requests

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Tue, 6 Aug 2024 22:08:34 +0200

On 8/6/24 20:37, Joanne Koong wrote:
> On Tue, Aug 6, 2024 at 11:26 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>>
>> On Tue, Aug 6, 2024 at 10:11 AM Bernd Schubert
>> <bernd.schubert@xxxxxxxxxxx> wrote:
>>>
>>> On 8/6/24 18:23, Joanne Koong wrote:
>>>
>>>>>
>>>>> This is very interesting. These logs (and the ones above with the
>>>>> lxcfs server running concurrently) are showing that the read request
>>>>> was freed but not through the do_fuse_request_end path. It's weird
>>>>> that fuse_simple_request reached fuse_put_request without
>>>>> do_fuse_request_end having been called (which is the only place where
>>>>> FR_FINISHED gets set and wakes up the wait events in
>>>>> request_wait_answer).
>>>>>
>>>>> I'll take a deeper look tomorrow and try to make more sense of it.
>>>>
>>>> Finally realized what's happening!
>>>> When we kill the cat program, if the request hasn't been sent out to
>>>> userspace yet when the fatal signal interrupts the
>>>> wait_event_interruptible and wait_event_killable in
>>>> request_wait_answer(), this will clean up the request manually (not
>>>> through the fuse_request_end() path), which doesn't delete the timer.
>>>>
>>>> I'll fix this for v3.
>>>>
>>>> Thank you for surfacing this and it would be much appreciated if you
>>>> could test out v3 when it's submitted to make sure.
>>>
>>> It is still just a suggestion, but if the timer would have its own ref,
>>> any oversight of another fuse_put_request wouldn't be fatal.
>>>
>>
>> Thanks for the suggestion. My main concerns are whether it's worth the
>> extra (minimal?) performance penalty for something that's not strictly
>> needed and whether it ends up adding more of a burden to keep track of
>> the timer ref (eg in error handling like the case above where the
>> fatal signal is for a request that hasn't been sent to userspace yet,
>> having to account for the extra timer ref if the timer callback didn't
>> execute). I don't think adding a timer ref would prevent fatal crashes
>> on fuse_put_request oversights (unless we also mess up not releasing a
>> corresponding timer ref  :))
> 
> I amend this last sentence - I just realized your point about the
> fatal crashes is that if we accidentally miss a fuse_put_request
> altogether, we'd also miss releasing the timer ref in that path, which
> means the timer callback would be the one releasing the last ref.
> 

Yeah, that is what I meant. It is a bit defensive coding, but I don't
have a strong opinion about it.

Thanks,
Bernd