Re: [PATCH RESEND v9 2/3] fuse: add optional kernel-enforced timeout for requests

On 11/28/24 12:09, Sergey Senozhatsky wrote:
> On (24/11/28 12:00), Bernd Schubert wrote:
>> On 11/28/24 11:44, Sergey Senozhatsky wrote:
>>> Hi Joanne,
>>>
>>> On (24/11/14 11:13), Joanne Koong wrote:
>>>> There are situations where fuse servers can become unresponsive or
>>>> stuck, for example if the server is deadlocked. Currently, there's no
>>>> good way to detect if a server is stuck and needs to be killed manually.
>>>>
>>>> This commit adds an option for enforcing a timeout (in minutes) for
>>>> requests where if the timeout elapses without the server responding to
>>>> the request, the connection will be automatically aborted.
>>>
>>> Does it make sense to configure timeout in seconds?  hung-task watchdog
>>> operates in seconds and can be set to anything, e.g. 45 seconds, so it
>>> can panic the system before fuse timeout has a chance to trigger.
>>>
>>> Another question is: this will terminate the connection.  Does it
>>> make sense to run timeout per request and just "abort" individual
>>> requests?  What I'm currently playing with here on our side is
>>> something like this:
> 
> Thanks for the pointers again, Bernd.
> 
>> Miklos had asked to abort the connection in v4:
>> https://lore.kernel.org/all/CAJfpegsiRNnJx7OAoH58XRq3zujrcXx94S2JACFdgJJ_b8FdHw@xxxxxxxxxxxxxx/raw
> 
> OK, sounds reasonable. I'll try to give the series some testing in the
> coming days.
> 
> // I still would probably prefer "seconds" timeout granularity.
> // Unless this also has been discussed already and Bernd has a link ;)


The issue is that the timeout check currently iterates through 256 hash
lists + pending + bg, so a finer (seconds) granularity means running
that scan more often:

https://lore.kernel.org/all/CAJnrk1b7bfAWWq_pFP=4XH3ddc_9GtAM2mE7EgWnx2Od+UUUjQ@xxxxxxxxxxxxxx/raw


Personally, I would prefer a second list to avoid the check spikes and
added latency:
https://lore.kernel.org/linux-fsdevel/9ba4eaf4-b9f0-483f-90e5-9512aded419e@xxxxxxxxxxx/raw

What is your opinion about that? I guess Android and Chromium have an
interest in low latencies and avoiding CPU spikes?
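
To make the trade-off concrete, here is a rough userspace sketch of the
two approaches. It is not code from the patch: the struct and function
names (conn, req, queue_req, scan_hash_expired, list_head_expired) are
made up for illustration, and it assumes a single per-connection timeout
and a monotonic clock, so that insertion order equals expiry order.
Removal on request completion is left out; in the kernel that would be a
list_del() on a list_head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HASH_SIZE 256	/* the 256 hash lists mentioned above */

/* Hypothetical request: linked into one hash bucket and one timeout list. */
struct req {
	unsigned long id;
	unsigned long deadline;		/* queue time + timeout, in ticks */
	struct req *hash_next;
	struct req *tmo_next;
};

/* Hypothetical connection state. */
struct conn {
	struct req *hash[HASH_SIZE];
	struct req *tmo_head, *tmo_tail;	/* the "second list" */
	unsigned long timeout;
};

void conn_init(struct conn *c, unsigned long timeout)
{
	memset(c, 0, sizeof(*c));
	c->timeout = timeout;
}

void queue_req(struct conn *c, struct req *r, unsigned long id,
	       unsigned long now)
{
	struct req **bucket = &c->hash[id % HASH_SIZE];

	r->id = id;
	r->deadline = now + c->timeout;
	r->hash_next = *bucket;
	*bucket = r;

	/* Same timeout for everyone + monotonic clock => list stays sorted. */
	assert(!c->tmo_tail || c->tmo_tail->deadline <= r->deadline);
	r->tmo_next = NULL;
	if (c->tmo_tail)
		c->tmo_tail->tmo_next = r;
	else
		c->tmo_head = r;
	c->tmo_tail = r;
}

/* Approach 1: walk every bucket; cost grows with buckets + requests. */
bool scan_hash_expired(const struct conn *c, unsigned long now,
		       size_t *visited)
{
	*visited = 0;
	for (size_t i = 0; i < HASH_SIZE; i++) {
		(*visited)++;
		for (const struct req *r = c->hash[i]; r; r = r->hash_next) {
			(*visited)++;
			if (r->deadline <= now)
				return true;
		}
	}
	return false;
}

/* Approach 2: only the oldest request can expire first; O(1) check. */
bool list_head_expired(const struct conn *c, unsigned long now)
{
	return c->tmo_head && c->tmo_head->deadline <= now;
}
```

With nothing expired, the first variant still has to touch every bucket
head and every queued request, while the second only looks at the list
head, which is why the check could run at a seconds granularity without
a periodic spike.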


Thanks,
Bernd





