On Mon, Jun 05, 2023 at 04:25:05PM +0530, Ritesh Harjani (IBM) wrote:
> We dont need to allocate an iop in ->write_begin() for writes where the
> position and length completely overlap with the given folio.
> Therefore, such cases are skipped.
>
> Currently when the folio is uptodate, we only allocate iop at writeback
> time (in iomap_writepage_map()). This is ok until now, but when we are
> going to add support for per-block dirty state bitmap in iop, this
> could cause some performance degradation. The reason is that if we don't
> allocate iop during ->write_begin(), then we will never mark the
> necessary dirty bits in ->write_end() call. And we will have to mark all
> the bits as dirty at the writeback time, that could cause the same write
> amplification and performance problems as it is now.
>
> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>

Makes sense to me, but moving on to the next patch...
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>

--D

> ---
>  fs/iomap/buffered-io.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index f55a339f99ec..2a97d73edb96 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -571,15 +571,24 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
>  	size_t from = offset_in_folio(folio, pos), to = from + len;
>  	size_t poff, plen;
>
> -	if (folio_test_uptodate(folio))
> +	/*
> +	 * If the write completely overlaps the current folio, then
> +	 * entire folio will be dirtied so there is no need for
> +	 * per-block state tracking structures to be attached to this folio.
> +	 */
> +	if (pos <= folio_pos(folio) &&
> +	    pos + len >= folio_pos(folio) + folio_size(folio))
>  		return 0;
> -	folio_clear_error(folio);
>
>  	iop = iomap_iop_alloc(iter->inode, folio, iter->flags);
>
>  	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
>  		return -EAGAIN;
>
> +	if (folio_test_uptodate(folio))
> +		return 0;
> +	folio_clear_error(folio);
> +
>  	do {
>  		iomap_adjust_read_range(iter->inode, folio, &block_start,
>  				block_end - block_start, &poff, &plen);
> --
> 2.40.1
>
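
For anyone following along, the full-overlap test the patch adds can be
tried outside the kernel. Below is a minimal userspace sketch of that
condition only, not the kernel code itself: pos_t and write_covers_folio()
are hypothetical names made up for illustration, and the folio start and
size are passed in as plain integers, standing in for what folio_pos() and
folio_size() return in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's loff_t. */
typedef int64_t pos_t;

/*
 * Return true when the write [pos, pos + len) covers the whole folio
 * occupying [folio_start, folio_start + folio_bytes). This mirrors the
 * condition in the patch, where folio_pos() and folio_size() supply
 * the last two values.
 */
bool write_covers_folio(pos_t pos, size_t len,
			pos_t folio_start, size_t folio_bytes)
{
	/* Illustration-only sanity check; a folio is never empty. */
	assert(folio_bytes > 0);

	return pos <= folio_start &&
	       pos + (pos_t)len >= folio_start + (pos_t)folio_bytes;
}
```

With a 4096-byte folio at offset 0, a 4096-byte write at offset 0 covers
it, so no per-block state would be needed for that write. A 1024-byte
write at offset 512 does not cover it, and per the commit message that is
the case where the iop gets allocated in ->write_begin() so that
->write_end() can mark the right dirty bits.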