Re: [HELP] FUSE writeback performance bottleneck

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Tue, 4 Jun 2024 16:13:25 +0200

On 6/4/24 12:02, Miklos Szeredi wrote:
> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
> 
>> Back to the background for the copy, so it copies pages to avoid
>> blocking on memory reclaim. With that allocation it in fact increases
>> memory pressure even more. Isn't the right solution to mark those pages
>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>> pages do, just not in beautiful way.
> 
> Copying to the tmp page is the same as marking the pages as
> non-reclaimable and non-syncable.
> 
> Conceptually it would be nice to only copy when there's something
> actually waiting for writeback on the page.
> 
> Note: normally the WRITE request would be copied to userspace along
> with the contents of the pages very soon after starting writeback.
> After this the contents of the page no longer matter, and we can just
> clear writeback without doing the copy.
> 
> But if the request gets stuck in the input queue before being copied
> to userspace, then deadlock can still happen if the server blocks on
> direct reclaim and won't continue with processing the queue.   And
> sync(2) will also block in that case.>
> So we'd somehow need to handle stuck WRITE requests.   I don't see an
> easy way to do this "on demand", when something actually starts
> waiting on PG_writeback.  Alternatively the page copy could be done
> after a timeout, which is ugly, but much easier to implement.

I think the timeout method would only work if we have already allocated
the pages, under memory pressure page allocation might not work well.
But then this still seems to be a workaround, because we don't take any
less memory with these copied pages.
I'm going to look into mm/ if there isn't a better solution.

> 
> Also splice from the fuse dev would need to copy those pages, but that
> shouldn't be a problem, since it's just moving the copy from one place
> to another.

Ok, at least I need to keep an eye on it that it doesn't break when I
write a patch.

Thanks,
Bernd