Re: remove iomap_writepage v2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 28, 2022 at 03:18:03PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 28, 2022 at 01:10:16PM +0200, Jan Kara wrote:
> > Hi Christoph!
> > 
> > On Tue 19-07-22 06:13:07, Christoph Hellwig wrote:
> > > this series removes iomap_writepage and it's callers, following what xfs
> > > has been doing for a long time.
> > 
> > So this effectively means "no writeback from page reclaim for these
> > filesystems" AFAICT (page migration of dirty pages seems to be handled by
> > iomap_migrate_page()) which is going to make life somewhat harder for
> > memory reclaim when memory pressure is high enough that dirty pages are
> > reaching end of the LRU list. I don't expect this to be a problem on big
> > machines but it could have some undesirable effects for small ones
> > (embedded, small VMs). I agree per-page writeback has been a bad idea for
> > efficiency reasons for at least last 10-15 years and most filesystems
> > stopped dealing with more complex situations (like block allocation) from
> > ->writepage() already quite a few years ago without any bug reports AFAIK.
> > So it all seems like a sensible idea from FS POV but are MM people on board
> > or at least aware of this movement in the fs land?
> 
> I mentioned it during my folio session at LSFMM, but didn't put a huge
> emphasis on it.
> 
> For XFS, writeback should already be in progress on other pages if
> we're getting to the point of trying to call ->writepage() in vmscan.
> Surely this is also true for other filesystems?

Yes.

It's definitely true for btrfs, too, because btrfs_writepage does:

static int btrfs_writepage(struct page *page, struct writeback_control *wbc)
{
        struct inode *inode = page->mapping->host;
        int ret;

        if (current->flags & PF_MEMALLOC) {
                redirty_page_for_writepage(wbc, page);
                unlock_page(page);
                return 0;
        }
....

It also rejects all calls to write dirty pages from memory reclaim
contexts.

ext4 will also reject writepage calls from memory allocation if
block allocation is required (due to delayed allocation) or
unwritten extents need converting to written. i.e. if it has to run
blocking transactions.

So all three major filesystems will either partially or wholly
reject ->writepage calls from memory reclaim context.

IOWs, if memory reclaim is depending on ->writepage() to make
reclaim progress, it's not working as advertised on the vast
majority of production Linux systems....

The reality is that ->writepage is a relic of a bygone era of OS and
filesystem design. It was useful in the days where writing a dirty
page just involved looking up the bufferhead attached to the page to
get the disk mapping and then submitting it for IO.

Those days are long gone - filesystems have complex IO submission
paths now that have to handle delayed allocation, copy-on-write,
unwritten extents, have unbound memory demand, etc. All the
filesystems that support these 1990s era filesystem technologies
simply turn off ->writepage in memory reclaim contexts.

Hence for the vast majority of linux users (i.e. everyone using
ext4, btrfs and XFS), ->writepage no longer plays any part in memory
reclaim on their systems.

So why should we try to maintain the fiction that ->writepage is
required functionality in a filesystem when it clearly isn't?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux