Earlier, when the folio was uptodate, we only allocated the iop at
writeback time (in iomap_writepage_map()). This has been fine so far,
but once we add support for subpage size dirty bitmap tracking in the
iop, it could cause performance degradation. The reason is that if we
don't allocate the iop during ->write_begin(), then we never mark the
necessary dirty bits in the ->write_end() call, and we have to mark all
the bits as dirty at writeback time. That could cause the same write
amplification and performance problems we have now (without subpage
dirty bitmap tracking in the iop).

However, for all writes whose (pos, len) range completely overlaps the
given folio, there is no need to allocate an iop during
->write_begin(). So skip those cases.

Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>
---
 fs/iomap/buffered-io.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6f4c97a6d7e9..e43821bd1ff5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -562,14 +562,31 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 	size_t from = offset_in_folio(folio, pos), to = from + len;
 	size_t poff, plen;
 
-	if (folio_test_uptodate(folio))
+	/*
+	 * If the write completely overlaps the current folio, then
+	 * entire folio will be dirtied so there is no need for
+	 * sub-folio state tracking structures to be attached to this folio.
+	 */
+
+	if (pos <= folio_pos(folio) &&
+	    pos + len >= folio_pos(folio) + folio_size(folio))
 		return 0;
-	folio_clear_error(folio);
 
 	iop = iomap_page_create(iter->inode, folio, iter->flags);
+
+	/*
+	 * If we don't have an iop and nr_blocks > 1 then return -EAGAIN here
+	 * even though the folio may be uptodate. To ensure we add sub-folio
+	 * state tracking structures to this folio.
+	 */
 	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
 		return -EAGAIN;
 
+	if (folio_test_uptodate(folio))
+		return 0;
+	folio_clear_error(folio);
+
+
 	do {
 		iomap_adjust_read_range(iter->inode, folio, &block_start,
 				block_end - block_start, &poff, &plen);
-- 
2.39.2
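
For readers who want to check the new ordering outside the kernel, here is a
minimal userspace sketch of the decision made at the top of
__iomap_write_begin() after this patch. It is not kernel code: struct
model_folio, enum begin_action, write_covers_folio() and model_write_begin()
are hypothetical names invented for illustration, and the IOMAP_NOWAIT /
-EAGAIN path is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio; not the kernel structure. */
struct model_folio {
	long long pos;   /* byte offset in the file, as folio_pos() would give */
	size_t size;     /* folio length in bytes, as folio_size() would give */
	bool uptodate;   /* as folio_test_uptodate() would report */
};

/* Hypothetical labels for the three outcomes modelled here. */
enum begin_action {
	SKIP_NO_IOP,        /* full overwrite: return before allocating an iop */
	ALLOC_IOP_ONLY,     /* partial write, folio uptodate: allocate, then return */
	ALLOC_IOP_MAY_READ, /* partial write, folio not uptodate: allocate, enter read loop */
};

/* Same comparison as the check added by the patch. */
static bool write_covers_folio(const struct model_folio *f,
			       long long pos, size_t len)
{
	return pos <= f->pos &&
	       pos + (long long)len >= f->pos + (long long)f->size;
}

/* Order of checks as in the patched function, minus the NOWAIT handling. */
static enum begin_action model_write_begin(const struct model_folio *f,
					   long long pos, size_t len)
{
	if (write_covers_folio(f, pos, len))
		return SKIP_NO_IOP;
	if (f->uptodate)
		return ALLOC_IOP_ONLY;
	return ALLOC_IOP_MAY_READ;
}

/* Usage example: a 4096-byte folio at file offset 4096. */
static void model_examples(void)
{
	struct model_folio f = { .pos = 4096, .size = 4096, .uptodate = true };

	/* The write covers the folio exactly, so no iop is needed. */
	assert(model_write_begin(&f, 4096, 4096) == SKIP_NO_IOP);
	/* A 512-byte write into an uptodate folio now allocates the iop early. */
	assert(model_write_begin(&f, 4096, 512) == ALLOC_IOP_ONLY);
}
```

The second assertion is the case the commit message is about: before this
patch an uptodate folio returned early without an iop, so ->write_end() had
no per-block state in which to record the dirty range.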