On Thu, Jul 28, 2022 at 03:18:03PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 28, 2022 at 01:10:16PM +0200, Jan Kara wrote:
> > Hi Christoph!
> >
> > On Tue 19-07-22 06:13:07, Christoph Hellwig wrote:
> > > this series removes iomap_writepage and it's callers, following what xfs
> > > has been doing for a long time.
> >
> > So this effectively means "no writeback from page reclaim for these
> > filesystems" AFAICT (page migration of dirty pages seems to be handled by
> > iomap_migrate_page()) which is going to make life somewhat harder for
> > memory reclaim when memory pressure is high enough that dirty pages are
> > reaching end of the LRU list. I don't expect this to be a problem on big
> > machines but it could have some undesirable effects for small ones
> > (embedded, small VMs). I agree per-page writeback has been a bad idea for
> > efficiency reasons for at least last 10-15 years and most filesystems
> > stopped dealing with more complex situations (like block allocation) from
> > ->writepage() already quite a few years ago without any bug reports AFAIK.
> > So it all seems like a sensible idea from FS POV but are MM people on board
> > or at least aware of this movement in the fs land?
>
> I mentioned it during my folio session at LSFMM, but didn't put a huge
> emphasis on it.
>
> For XFS, writeback should already be in progress on other pages if
> we're getting to the point of trying to call ->writepage() in vmscan.
> Surely this is also true for other filesystems?

Yes.

It's definitely true for btrfs, too, because btrfs_writepage does:

static int btrfs_writepage(struct page *page, struct writeback_control *wbc)
{
	struct inode *inode = page->mapping->host;
	int ret;

	if (current->flags & PF_MEMALLOC) {
		redirty_page_for_writepage(wbc, page);
		unlock_page(page);
		return 0;
	}
	....

It also rejects all calls to write dirty pages from memory reclaim
contexts.

ext4 will also reject writepage calls from memory allocation if
block allocation is required (due to delayed allocation) or
unwritten extents need converting to written, i.e. if it has to
run blocking transactions.

So all three major filesystems will either partially or wholly
reject ->writepage calls from memory reclaim context.

IOWs, if memory reclaim is depending on ->writepage() to make
reclaim progress, it's not working as advertised on the vast
majority of production Linux systems....

The reality is that ->writepage is a relic of a bygone era of OS and
filesystem design. It was useful in the days when writing a dirty
page just involved looking up the bufferhead attached to the page to
get the disk mapping and then submitting it for IO.

Those days are long gone - filesystems have complex IO submission
paths now that have to handle delayed allocation, copy-on-write,
unwritten extents, unbounded memory demand, etc. All the
filesystems that support these 1990s-era filesystem technologies
simply turn off ->writepage in memory reclaim contexts.

Hence for the vast majority of Linux users (i.e. everyone using
ext4, btrfs and XFS), ->writepage no longer plays any part in memory
reclaim on their systems.

So why should we try to maintain the fiction that ->writepage is
required functionality in a filesystem when it clearly isn't?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx