On Tue 30-07-19 17:57:18, Konstantin Khlebnikov wrote:
> On 30.07.2019 17:14, Jan Kara wrote:
> > On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
> > > On 23.07.2019 3:52, Andrew Morton wrote:
> > > >
> > > > (cc linux-fsdevel and Jan)
> >
> > Thanks for CC Andrew.
> >
> > > > On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Functions like filemap_write_and_wait_range() should do nothing if inode
> > > > > has no dirty pages or pages currently under writeback. But they anyway
> > > > > construct struct writeback_control and this does some atomic operations
> > > > > if CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
> > > > > updates state of writeback ownership, on slow path might be more work.
> > > > > Current this path is safely avoided only when inode mapping has no pages.
> > > > >
> > > > > For example generic_file_read_iter() calls filemap_write_and_wait_range()
> > > > > at each O_DIRECT read - pretty hot path.
> >
> > Yes, but in common case mapping_needs_writeback() is false for files you do
> > direct IO to (exactly the case with no pages in the mapping). So you
> > shouldn't see the overhead at all. So which case you really care about?
> >
> > > > > This patch skips starting new writeback if mapping has no dirty tags set.
> > > > > If writeback is already in progress filemap_write_and_wait_range() will
> > > > > wait for it.
> > > > >
> > > > > ...
> > > > >
> > > > > --- a/mm/filemap.c
> > > > > +++ b/mm/filemap.c
> > > > > @@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
> > > > >  		.range_end = end,
> > > > >  	};
> > > > > -	if (!mapping_cap_writeback_dirty(mapping))
> > > > > +	if (!mapping_cap_writeback_dirty(mapping) ||
> > > > > +	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> > > > >  		return 0;
> > > > >  	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
> > > >
> > > > How does this play with tagged_writepages?  We assume that no tagging
> > > > has been performed by any __filemap_fdatawrite_range() caller?
> > > >
> > >
> > > Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
> > >
> > > To-write tags are supposed to be a subset of dirty tags:
> > > to-write is set only when dirty is set and cleared after starting writeback.
> > >
> > > Special case set_page_writeback_keepwrite() which does not clear to-write
> > > should be for dirty page thus dirty tag is not going to be cleared either.
> > > Ext4 calls it after redirty_page_for_writepage()
> > > XFS even without clear_page_dirty_for_io()
> > >
> > > Anyway to-write tag without dirty tag or at clear page is confusing.
> >
> > Yeah, TOWRITE tag is intended to be internal to writepages logic so your
> > patch is fine in that regard. Overall the patch looks good to me so I'm
> > just wondering a bit about the motivation...
>
> In our case file mixes cached pages and O_DIRECT read. Kind of database
> were index header is memory mapped while the rest data read via O_DIRECT.
> I suppose for sharing index between multiple instances.

OK, that has always been a bit problematic but you're not the first one to
have such design ;). So feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

to your patch.

> On this path we also hit this bug:
> https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
> so that's why I've started looking into this code.

I see. OK.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
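
For readers following the thread, the early return discussed above can be modelled in userspace roughly as below. This is only a sketch, not kernel code: struct model_mapping, the MODEL_TAG_* values, model_mapping_tagged() and model_fdatawrite_range() are hypothetical stand-ins for struct address_space, the PAGECACHE_TAG_* tags, mapping_tagged() and __filemap_fdatawrite_range(), and the attach_calls counter merely stands in for the cost of wbc_attach_fdatawrite_inode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative tag bits; the real kernel tags are not plain bit flags. */
enum model_tag {
	MODEL_TAG_DIRTY   = 1,
	MODEL_TAG_TOWRITE = 2,
};

/* Hypothetical model of a mapping: only the state the check looks at. */
struct model_mapping {
	bool cap_writeback_dirty;	/* models mapping_cap_writeback_dirty() */
	unsigned int tags;		/* models the per-mapping tag summary */
	unsigned int attach_calls;	/* counts the costly wbc attach step */
};

static bool model_mapping_tagged(const struct model_mapping *m,
				 unsigned int tag)
{
	return (m->tags & tag) != 0;
}

/*
 * Models the patched check: return early, before the attach step,
 * when the mapping cannot do writeback or has no dirty tag set.
 * Returns true if writeback would have been started.
 */
static bool model_fdatawrite_range(struct model_mapping *m)
{
	assert(m != NULL);

	if (!m->cap_writeback_dirty ||
	    !model_mapping_tagged(m, MODEL_TAG_DIRTY))
		return false;

	m->attach_calls++;	/* stands in for wbc_attach_fdatawrite_inode() */
	return true;
}
```

In this model a mapping that holds cached but clean pages (tags == 0) never reaches the attach step, which is the mixed mmap plus O_DIRECT case described in the thread. MODEL_TAG_TOWRITE is declared only to mirror the subset argument: under that argument it is never set without MODEL_TAG_DIRTY, so checking the dirty tag alone is enough.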