On 6/4/24 12:02, Miklos Szeredi wrote:
> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
>
>> Back to the background for the copy, so it copies pages to avoid
>> blocking on memory reclaim. With that allocation it in fact increases
>> memory pressure even more. Isn't the right solution to mark those pages
>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>> pages do, just not in beautiful way.
>
> Copying to the tmp page is the same as marking the pages as
> non-reclaimable and non-syncable.
>
> Conceptually it would be nice to only copy when there's something
> actually waiting for writeback on the page.
>
> Note: normally the WRITE request would be copied to userspace along
> with the contents of the pages very soon after starting writeback.
> After this the contents of the page no longer matter, and we can just
> clear writeback without doing the copy.
>
> But if the request gets stuck in the input queue before being copied
> to userspace, then deadlock can still happen if the server blocks on
> direct reclaim and won't continue with processing the queue. And
> sync(2) will also block in that case.
>
> So we'd somehow need to handle stuck WRITE requests. I don't see an
> easy way to do this "on demand", when something actually starts
> waiting on PG_writeback. Alternatively the page copy could be done
> after a timeout, which is ugly, but much easier to implement.

I think the timeout method would only work if we have already allocated
the pages, under memory pressure page allocation might not work well.
But then this still seems to be a workaround, because we don't take any
less memory with these copied pages.
I'm going to look into mm/ if there isn't a better solution.

>
> Also splice from the fuse dev would need to copy those pages, but that
> shouldn't be a problem, since it's just moving the copy from one place
> to another.

Ok, at least I need to keep an eye on it that it doesn't break when I
write a patch.


Thanks,
Bernd