Re: [HELP] FUSE writeback performance bottleneck

Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> · Tue, 4 Jun 2024 20:24:06 +0800

On 6/4/24 5:32 PM, Bernd Schubert wrote:
> 
> 
> On 6/4/24 09:36, Jingbo Xu wrote:
>>
>>
>> On 6/4/24 3:27 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 03:57, Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>>>
>>>> IIUC, there are two sources that may cause deadlock:
>>>> 1) the fuse server needs memory allocation when processing FUSE_WRITE
>>>> requests, which in turn triggers direct memory reclaim and then FUSE
>>>> writeback - deadlock here
>>>
>>> Yep, see the folio_wait_writeback() call deep in the guts of direct
>>> reclaim, which sleeps until the PG_writeback flag is cleared.  If that
>>> happens to be triggered by the writeback in question, then that's a
>>> deadlock.
>>>
>>>> 2) a process that triggers direct memory reclaim or calls sync(2) may
>>>> hang there forever, if the fuse server is buggy or malicious and thus
>>>> hangs there when processing FUSE_WRITE requests
>>>
>>> Ah, yes, sync(2) is also an interesting case.   We don't want unpriv
>>> fuse servers to be able to block sync(2), which means that sync(2)
>>> won't actually guarantee a synchronization of fuse's dirty pages.  I
>>> don't think there's even a theoretical solution to that, but
>>> apparently nobody cares...
>>
>> Okay, if the temp page design is unavoidable, then I don't know if there
>> is any approach (in the FUSE or VFS layer) that helps offload the page
>> copy.  At least we don't want the writeback performance to be limited by
>> the single writeback kworker.  This is also what this thread initially
>> set out to address.
>>
> 
> Offloading it to another thread is just a workaround, though maybe a
> temporary solution.

If we could break the limit of only a single (writeback) kworker per
bdi... Apparently that's much more complicated.  Just a brainstorming
idea...

I agree it's a tough thing.
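
To make deadlock source 1) quoted above concrete, here is a toy user-space
model of the wait-for chain.  It is only a sketch: the node names are
illustrative labels, not kernel identifiers, and the "with temp pages"
graph reflects my understanding that writeback on the page cache page
ends as soon as the data has been copied to the temp page.

```python
# Toy wait-for graph: each key waits for its value (None = waits for nothing).
# All names here are made up for illustration; none is a kernel symbol.

def find_cycle(waits_for, start):
    """Follow 'X waits for Y' edges from start; return the cycle or None."""
    path = []
    node = start
    while node is not None:
        if node in path:
            # We came back to a node already on the path: that is the cycle.
            return path[path.index(node):]
        path.append(node)
        node = waits_for.get(node)
    return None

# Without temp pages: the page cache page stays under writeback until the
# server replies to FUSE_WRITE.
no_temp_pages = {
    "fuse_server": "direct_reclaim",     # server allocates memory, enters reclaim
    "direct_reclaim": "page_writeback",  # reclaim sleeps until PG_writeback clears
    "page_writeback": "fuse_server",     # flag clears only when the server replies
}

# With temp pages (assumption stated above): writeback on the page cache
# page completes right after the copy, so it waits on nothing.
with_temp_pages = {
    "fuse_server": "direct_reclaim",
    "direct_reclaim": "page_writeback",
    "page_writeback": None,
}

print(find_cycle(no_temp_pages, "fuse_server"))
print(find_cycle(with_temp_pages, "fuse_server"))
```

The first call finds the three-node cycle (server, reclaim, writeback);
the second returns None because the copy removes the edge back to the
server.  That copy is exactly the cost that ends up on the single
writeback kworker.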

-- 
Thanks,
Jingbo