On 30.07.2019 18:48, Jan Kara wrote:
On Tue 30-07-19 17:57:18, Konstantin Khlebnikov wrote:
On 30.07.2019 17:14, Jan Kara wrote:
On Tue 23-07-19 11:16:51, Konstantin Khlebnikov wrote:
On 23.07.2019 3:52, Andrew Morton wrote:
(cc linux-fsdevel and Jan)
Thanks for CC Andrew.
On Mon, 22 Jul 2019 12:36:08 +0300 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx> wrote:
Functions like filemap_write_and_wait_range() should do nothing if inode
has no dirty pages or pages currently under writeback. But they construct
struct writeback_control anyway, and this does some atomic operations
if CONFIG_CGROUP_WRITEBACK=y - on the fast path it locks inode->i_lock and
updates the state of writeback ownership, on the slow path there might be more work.
Currently this path is safely avoided only when the inode mapping has no pages.
For example generic_file_read_iter() calls filemap_write_and_wait_range()
at each O_DIRECT read - pretty hot path.
Yes, but in the common case mapping_needs_writeback() is false for files you do
direct IO to (exactly the case with no pages in the mapping). So you
shouldn't see the overhead at all. So which case do you really care about?
This patch skips starting new writeback if mapping has no dirty tags set.
If writeback is already in progress filemap_write_and_wait_range() will
wait for it.
...
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -408,7 +408,8 @@ int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
 		.range_end = end,
 	};
-	if (!mapping_cap_writeback_dirty(mapping))
+	if (!mapping_cap_writeback_dirty(mapping) ||
+	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
 		return 0;
 	wbc_attach_fdatawrite_inode(&wbc, mapping->host);
How does this play with tagged_writepages? We assume that no tagging
has been performed by any __filemap_fdatawrite_range() caller?
Checking also PAGECACHE_TAG_TOWRITE is cheap but seems redundant.
To-write tags are supposed to be a subset of dirty tags:
to-write is set only when dirty is set and cleared after starting writeback.
The special case is set_page_writeback_keepwrite(), which does not clear to-write.
It should only be used for a dirty page, so the dirty tag is not going to be cleared either.
Ext4 calls it after redirty_page_for_writepage().
XFS calls it even without clear_page_dirty_for_io().
Anyway, a to-write tag without a dirty tag, or on a clean page, is confusing.
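
The subset argument above can be checked with a small userspace model. This is only a sketch: the toy_* names and the three boolean flags are hypothetical stand-ins for the page cache tags, and the transitions are simplified from the description in this thread rather than copied from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-page state. The fields only mirror the kernel tag names. */
struct toy_page {
	bool dirty;     /* models PAGECACHE_TAG_DIRTY */
	bool towrite;   /* models PAGECACHE_TAG_TOWRITE */
	bool writeback; /* models PAGECACHE_TAG_WRITEBACK */
};

/* The property claimed in the thread: to-write implies dirty. */
static bool toy_tags_consistent(const struct toy_page *p)
{
	return !p->towrite || p->dirty;
}

static void toy_set_dirty(struct toy_page *p)
{
	p->dirty = true;
}

/* Tagging pass: to-write is set only where dirty is already set. */
static void toy_tag_for_writeback(struct toy_page *p)
{
	if (p->dirty)
		p->towrite = true;
}

/* Ordinary writeback start: both dirty and to-write are cleared. */
static void toy_start_writeback(struct toy_page *p)
{
	p->dirty = false;
	p->towrite = false;
	p->writeback = true;
}

/*
 * keepwrite variant: to-write is kept. The caller is expected to pass a
 * page that is still (or again) dirty, so the dirty flag stays set too.
 */
static void toy_start_writeback_keepwrite(struct toy_page *p)
{
	assert(p->dirty);
	p->writeback = true;
}
```

In this model no sequence of the four operations leaves to-write set on a clean page, which is why checking the dirty tag alone is enough for the early return.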
Yeah, TOWRITE tag is intended to be internal to writepages logic so your
patch is fine in that regard. Overall the patch looks good to me so I'm
just wondering a bit about the motivation...
In our case the file mixes cached pages and O_DIRECT reads. It is a kind of database
where the index header is memory mapped while the rest of the data is read via O_DIRECT.
I suppose this is for sharing the index between multiple instances.
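
To make that case concrete, here is a rough userspace model of the call path discussed in this thread. It is a sketch under stated assumptions, not kernel code: struct toy_mapping and the toy_* functions are hypothetical, the counter stands in for the work done by wbc_attach_fdatawrite_inode(), and both the mapping_cap_writeback_dirty() check and the wait for in-flight writeback are left out.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mapping {
	unsigned long nrpages;          /* cached pages, clean or dirty */
	bool tag_dirty;                 /* models mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) */
	unsigned long wbc_attach_calls; /* counts modelled wbc attach operations */
};

/* Models __filemap_fdatawrite_range(); `patched` enables the new check. */
static int toy_fdatawrite_range(struct toy_mapping *m, bool patched)
{
	if (patched && !m->tag_dirty)
		return 0;               /* new early return: nothing is dirty */
	m->wbc_attach_calls++;          /* stands in for i_lock + ownership update */
	return 0;
}

/* Models filemap_write_and_wait_range() as reached on each O_DIRECT read. */
static int toy_write_and_wait_range(struct toy_mapping *m, bool patched)
{
	if (m->nrpages == 0)            /* existing fast path: empty mapping */
		return 0;
	return toy_fdatawrite_range(m, patched);
}

/* Issue `reads` modelled O_DIRECT reads and return the attach count. */
static unsigned long toy_count_attaches(unsigned long nrpages, bool dirty,
					bool patched, unsigned long reads)
{
	struct toy_mapping m = { nrpages, dirty, 0 };
	unsigned long i;

	assert(!dirty || nrpages > 0);  /* a dirty tag needs at least one page */
	for (i = 0; i < reads; i++)
		toy_write_and_wait_range(&m, patched);
	return m.wbc_attach_calls;
}
```

With one clean cached page (the memory-mapped index header), the unpatched model pays the attach cost on every read, while the patched model pays it only when something is actually dirty.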
OK, that has always been a bit problematic but you're not the first one to
have such design ;). So feel free to add:
Reviewed-by: Jan Kara <jack@xxxxxxx>
to your patch.
Thanks.
O_DIRECT has a long history of misunderstandings =)
It looks like some cases are still not documented.
My favourite: O_DIRECT write into hole goes into cache, at least for ext4.
On this path we also hit this bug:
https://lore.kernel.org/lkml/156355839560.2063.5265687291430814589.stgit@buzz/
so that's why I've started looking into this code.
I see. OK.
Honza