Re: [HELP] FUSE writeback performance bottleneck

Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> · Fri, 13 Sep 2024 09:25:13 +0800

On 9/13/24 8:00 AM, Joanne Koong wrote:
> On Thu, Aug 22, 2024 at 8:34 PM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>>
>> On 6/4/24 6:02 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
>>>
>>>> Back to the background for the copy, so it copies pages to avoid
>>>> blocking on memory reclaim. With that allocation it in fact increases
>>>> memory pressure even more. Isn't the right solution to mark those pages
>>>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>>>> pages do, just not in beautiful way.
>>>
>>> Copying to the tmp page is the same as marking the pages as
>>> non-reclaimable and non-syncable.
>>>
>>> Conceptually it would be nice to only copy when there's something
>>> actually waiting for writeback on the page.
>>>
>>> Note: normally the WRITE request would be copied to userspace along
>>> with the contents of the pages very soon after starting writeback.
>>> After this the contents of the page no longer matter, and we can just
>>> clear writeback without doing the copy.
>>
>> OK this really deviates from my previous understanding of the deadlock
>> issue.  Previously I thought *after* the server has received the WRITE
>> request, i.e. has copied the request and page content to userspace, the
>> server needs to allocate some memory to handle the WRITE request, e.g.
>> make the data persistent on disk, or send the data to the remote
>> storage.  It is the memory allocation at this point that actually
>> triggers a memory direct reclaim (on the FUSE dirty page) and causes a
>> deadlock.  It seems that I misunderstand it.
> 
> I think your previous understanding is correct (or if not, then my
> understanding of this is incorrect too lol).
> The first write request makes it to userspace and when the server is
> in the middle of handling it, a memory reclaim is triggered where
> pages need to be written back. This leads to a SECOND write request
> (eg writing back the pages that are reclaimed) but this second write
> request will never be copied out to userspace because the server is
> stuck handling the first write request and waiting for the page
> reclaim bits of the reclaimed pages to be unset, but those reclaim
> bits can only be unset when the pages have been copied out to
> userspace, which only happens when the server reads /dev/fuse for the
> next request.

Right, that's true.

> 
>>
>> If that's true, we can clear PF_writeback as long as the whole request
>> along with the page content has already been copied to userspace, and
>> thus eliminate the tmp page copying.
>>
> 
> I think the problem is that on a single-threaded server,  the pages
> will not be copied out to userspace for the second request (aka
> writing back the dirty reclaimed pages) since the server is stuck on
> the first request.

Agreed.

-- 
Thanks,
Jingbo