A call to file[map]_write_and_wait_range() with an end offset that
precedes the start offset but happens to land in the same page can
trigger writeback submission but fail to wait on the submitted page.
Writeback submission occurs because __filemap_fdatawrite_range() passes
both offsets down into write_cache_pages(), which rounds down to page
indexes before it starts processing writeback. However,
__filemap_fdatawait_range() immediately returns if the byte-granular
end offset precedes the start offset.

This behavior was observed in the form of unpredictable latency from a
frequent write and wait call with incorrect parameters. The behavior
gave the impression that the fdatawait path might occasionally fail to
wait on writeback, but further investigation showed the latency was
from write_cache_pages() waiting on writeback state to clear for a page
already under writeback. This indicated that fdatawait actually never
waits on writeback in this particular situation.

The byte-granular check in __filemap_fdatawait_range() goes all the way
back to the old wait_on_page_writeback() helper. It originally used
page offsets and so would have waited in this problematic case. That
changed to byte-granularity file offsets in commit 94004ed726f3 ("kill
wait_on_page_writeback_range"), which subtly changed this behavior. The
check itself has become somewhat redundant since the error checking
code that used to follow the wait loop (at the time of the
aforementioned commit) has now been removed and lifted into the higher
level callers.

Therefore, we can restore historical fdatawait behavior by simply
removing the check. Since the current fdatawait behavior has been in
place for quite some time and is consistent with other interfaces that
use file offsets, instead lift the check into the
file[map]_write_and_wait_range() helpers to provide consistent behavior
between the write and wait.

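For illustration only, the mismatch between the two paths can be modeled
in a few lines of userspace C. This is a rough sketch, not kernel code:
every model_* name and MODEL_PAGE_SHIFT is hypothetical, a 4 KiB page
size is assumed, and the page walk in write_cache_pages() is reduced to
a single index comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12

typedef long long model_loff_t;	/* hypothetical stand-in for loff_t */

/*
 * Simplified model of the write side: offsets are rounded down to page
 * indexes first, so a reversed byte range that stays within one page
 * still covers that page.
 */
static bool model_write_visits_page(model_loff_t lstart, model_loff_t lend)
{
	unsigned long index = (unsigned long)(lstart >> MODEL_PAGE_SHIFT);
	unsigned long end = (unsigned long)(lend >> MODEL_PAGE_SHIFT);

	return index <= end;
}

/*
 * Simplified model of the wait side before this patch: the byte-granular
 * check returns early, so a reversed range never waits on any page.
 */
static bool model_wait_visits_page(model_loff_t start_byte,
				   model_loff_t end_byte)
{
	if (end_byte < start_byte)
		return false;

	return (start_byte >> MODEL_PAGE_SHIFT) <=
	       (end_byte >> MODEL_PAGE_SHIFT);
}

/* Model of the check once lifted into the write-and-wait helpers. */
static bool model_helper_does_work(model_loff_t lstart, model_loff_t lend)
{
	if (lend < lstart)
		return false;	/* neither write nor wait runs */

	return true;
}

/* Offsets 5000 and 4500 both fall in page index 1 (bytes 4096..8191). */
void model_check_reversed_same_page(void)
{
	assert(model_write_visits_page(5000, 4500));	/* page submitted */
	assert(!model_wait_visits_page(5000, 4500));	/* but never waited on */
	assert(!model_helper_does_work(5000, 4500));	/* lifted check skips both */
}
```

With the check lifted into the helpers, a reversed range is rejected
before either path runs, so the write and the wait agree.
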
Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
---
 mm/filemap.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 08341616ae7a..e7711b5a3f4c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -506,9 +506,6 @@ static void __filemap_fdatawait_range(struct address_space *mapping,
 	struct pagevec pvec;
 	int nr_pages;
 
-	if (end_byte < start_byte)
-		return;
-
 	pagevec_init(&pvec);
 	while (index <= end) {
 		unsigned i;
@@ -670,6 +667,9 @@ int filemap_write_and_wait_range(struct address_space *mapping,
 {
 	int err = 0, err2;
 
+	if (lend < lstart)
+		return 0;
+
 	if (mapping_needs_writeback(mapping)) {
 		err = __filemap_fdatawrite_range(mapping, lstart, lend,
 						 WB_SYNC_ALL);
@@ -770,6 +770,9 @@ int file_write_and_wait_range(struct file *file, loff_t lstart, loff_t lend)
 	int err = 0, err2;
 	struct address_space *mapping = file->f_mapping;
 
+	if (lend < lstart)
+		return 0;
+
 	if (mapping_needs_writeback(mapping)) {
 		err = __filemap_fdatawrite_range(mapping, lstart, lend,
 						 WB_SYNC_ALL);
-- 
2.37.3