Re: [PATCH 1/2] mm/filemap: don't initiate writeback if mapping has no dirty pages

On Tue 30-07-19 17:57:18, Konstantin Khlebnikov wrote:
> On 30.07.2019 17:14, Jan Kara wrote:
> > On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
> > > On 23.07.2019 3:52, Andrew Morton wrote:
> > > > 
> > > > (cc linux-fsdevel and Jan)
> > 
> > Thanks for the CC, Andrew.
> > 
> > > > On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx> wrote:
> > > > 
> > > > > Functions like filemap_write_and_wait_range() should do nothing if the inode
> > > > > has no dirty pages or pages currently under writeback. But they construct
> > > > > struct writeback_control anyway, and this does some atomic operations
> > > > > if CONFIG_CGROUP_WRITEBACK=y: on the fast path it locks inode->i_lock and
> > > > > updates the state of writeback ownership; on the slow path there might be more work.
> > > > > Currently this path is safely avoided only when the inode mapping has no pages.
> > > > > 
> > > > > For example, generic_file_read_iter() calls filemap_write_and_wait_range()
> > > > > on each O_DIRECT read, which is a pretty hot path.
> > 
> > Yes, but in the common case mapping_needs_writeback() is false for files you do
> > direct IO to (exactly the case with no pages in the mapping). So you
> > shouldn't see the overhead at all. So which case do you really care about?
> > 
> > > > > This patch skips starting new writeback if the mapping has no dirty tags set.
> > > > > If writeback is already in progress, filemap_write_and_wait_range() will
> > > > > wait for it.
> > > > > 
> > > > > ...
> > > > > 
> > > > > --- a/mm/filemap.c
> > > > > +++ b/mm/filemap.c
> > > > > @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
> > > > >    		.range_end = end,
> > > > >    	};
> > > > > -	if (!mapping_cap_writeback_dirty(mapping))
> > > > > +	if (!mapping_cap_writeback_dirty(mapping) ||
> > > > > +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > > > >    		return 0;
> > > > >    	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> > > > 
> > > > How does this play with tagged_writepages?  We assume that no tagging
> > > > has been performed by any __filemap_fdatawrite_range() caller?
> > > > 
> > > 
> > > Also checking PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
> > > 
> > > To-write tags are supposed to be a subset of dirty tags:
> > > to-write is set only when dirty is set, and is cleared after starting writeback.
> > > 
> > > The special case, set_page_writeback_keepwrite(), which does not clear to-write,
> > > should only be used for a dirty page, so the dirty tag is not going to be cleared either.
> > > Ext4 calls it after redirty_page_for_writepage();
> > > XFS calls it even without clear_page_dirty_for_io().
> > > 
> > > Anyway, a to-write tag without a dirty tag, or on a clean page, is confusing.
> > 
> > Yeah, the TOWRITE tag is intended to be internal to the writepages logic, so your
> > patch is fine in that regard. Overall the patch looks good to me, so I'm
> > just wondering a bit about the motivation...
> 
> In our case the file mixes cached pages and O_DIRECT reads. It is a kind of database
> where the index header is memory mapped while the rest of the data is read via O_DIRECT.
> I suppose this is for sharing the index between multiple instances.

OK, that has always been a bit problematic but you're not the first one to
have such a design ;). So feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

to your patch.

> On this path we also hit this bug:
> https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
> so that's why I've started looking into this code.

I see. OK.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
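
For anyone following the thread, here is a rough user-space model of the
check the patch adds to __filemap_fdatawrite_range(). It is only a sketch:
struct model_mapping, the MODEL_TAG_* flags, the attach_calls counter and all
model_* functions are hypothetical names that stand in for struct
address_space, PAGECACHE_TAG_DIRTY / PAGECACHE_TAG_TOWRITE and the call to
wbc_attach_fdatawrite_inode(). It assumes the tags can be collapsed into one
per-mapping bitmask, which the kernel does not do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PAGECACHE_TAG_DIRTY / PAGECACHE_TAG_TOWRITE. */
enum {
	MODEL_TAG_DIRTY   = 1u << 0,
	MODEL_TAG_TOWRITE = 1u << 1,
};

/* Hypothetical, heavily simplified stand-in for struct address_space. */
struct model_mapping {
	unsigned long nrpages;      /* pages in the page cache */
	unsigned int tags;          /* tags set on at least one page */
	bool cap_writeback_dirty;   /* models mapping_cap_writeback_dirty() */
	unsigned int attach_calls;  /* times the costly wbc setup was reached */
};

/* Check before the patch: only the capability test guards the setup. */
int model_fdatawrite_old(struct model_mapping *m)
{
	if (!m->cap_writeback_dirty)
		return 0;
	m->attach_calls++;  /* stands in for wbc_attach_fdatawrite_inode() */
	return 0;
}

/* Check after the patch: also bail out when nothing is tagged dirty. */
int model_fdatawrite_new(struct model_mapping *m)
{
	if (!m->cap_writeback_dirty || !(m->tags & MODEL_TAG_DIRTY))
		return 0;
	m->attach_calls++;
	return 0;
}

/* The invariant discussed in the thread: a to-write tag implies a dirty tag. */
bool model_towrite_implies_dirty(const struct model_mapping *m)
{
	return !(m->tags & MODEL_TAG_TOWRITE) || (m->tags & MODEL_TAG_DIRTY);
}

/* Cached but clean mapping, e.g. an mmapped index header plus O_DIRECT reads. */
unsigned int model_attach_calls_for_clean_mapping(bool patched)
{
	struct model_mapping m = {
		.nrpages = 1, .tags = 0, .cap_writeback_dirty = true,
	};

	assert(model_towrite_implies_dirty(&m));
	if (patched)
		model_fdatawrite_new(&m);
	else
		model_fdatawrite_old(&m);
	return m.attach_calls;
}
```

In this model, a mapping with one clean cached page reaches the setup step
with the old check and skips it with the new one, while a mapping that has
MODEL_TAG_DIRTY set reaches it either way.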


