On Tue, Jun 4, 2024 at 3:02 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:
> >
> > Back to the background for the copy, so it copies pages to avoid
> > blocking on memory reclaim. With that allocation it in fact increases
> > memory pressure even more. Isn't the right solution to mark those pages
> > as not reclaimable and to avoid blocking on it? Which is what the tmp
> > pages do, just not in beautiful way.
>
> Copying to the tmp page is the same as marking the pages as
> non-reclaimable and non-syncable.
>
> Conceptually it would be nice to only copy when there's something
> actually waiting for writeback on the page.
>
> Note: normally the WRITE request would be copied to userspace along
> with the contents of the pages very soon after starting writeback.
> After this the contents of the page no longer matter, and we can just
> clear writeback without doing the copy.
>
> But if the request gets stuck in the input queue before being copied
> to userspace, then deadlock can still happen if the server blocks on
> direct reclaim and won't continue with processing the queue. And
> sync(2) will also block in that case.

Why doesn't it suffice to just check if the page is being reclaimed and
do the tmp page allocation only if it's under reclaim?

>
> So we'd somehow need to handle stuck WRITE requests. I don't see an
> easy way to do this "on demand", when something actually starts
> waiting on PG_writeback. Alternatively the page copy could be done
> after a timeout, which is ugly, but much easier to implement.
>
> Also splice from the fuse dev would need to copy those pages, but that
> shouldn't be a problem, since it's just moving the copy from one place
> to another.
>
> Thanks,
> Miklos
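
To make the timeout alternative from the quoted text concrete, here is a
toy userspace model in Python. It is only a sketch of the decision logic,
not kernel code: WriteRequest, Outcome, writeback_outcome and copy_timeout
are hypothetical names made up for illustration, and the real fuse
writeback path is not structured like this. The assumption is that
writeback can be cleared without a copy once the server has read the
request, and that a request still queued at the deadline falls back to a
tmp page copy.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Outcome(Enum):
    PENDING = auto()            # still queued, page still under writeback
    CLEARED_NO_COPY = auto()    # server read the request before the deadline
    CLEARED_WITH_COPY = auto()  # request was stuck, fell back to a tmp copy


@dataclass
class WriteRequest:
    # Hypothetical model of a queued WRITE request; times are in seconds.
    queued_at: float                           # when writeback started
    read_by_server_at: Optional[float] = None  # when data reached userspace


def writeback_outcome(req: WriteRequest, now: float,
                      copy_timeout: float) -> Outcome:
    """Decide how writeback on the page ends under a copy-after-timeout rule."""
    deadline = req.queued_at + copy_timeout
    read_at = req.read_by_server_at

    # Common case: the server copied the request (and page contents) to
    # userspace in time, so the page contents no longer matter.
    if read_at is not None and read_at <= now and read_at <= deadline:
        return Outcome.CLEARED_NO_COPY

    # Stuck case: the request sat in the input queue past the deadline,
    # so the page was copied to a tmp page to let writeback finish.
    if now >= deadline:
        return Outcome.CLEARED_WITH_COPY

    # Neither happened yet: anything waiting on writeback still waits.
    return Outcome.PENDING
```

In this model the copy cost is only paid for requests that are still
queued at the deadline, while a waiter (reclaim or sync(2)) can be held up
for at most copy_timeout. It says nothing about how to pick that timeout,
or about whether checking for reclaim alone would be sufficient.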