On Tue, Feb 18, 2025 at 04:02:43PM +0100, Hannes Reinecke wrote:
> On 2/17/25 22:58, Matthew Wilcox wrote:
> > On Tue, Feb 04, 2025 at 03:12:05PM -0800, Luis Chamberlain wrote:
> > > @@ -182,7 +182,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
> > >                  goto confused;
> > >          block_in_file = folio_pos(folio) >> blkbits;
> > > -        last_block = block_in_file + args->nr_pages * blocks_per_page;
> > > +        last_block = block_in_file + args->nr_pages * blocks_per_folio;
> >
> > In mpage_readahead(), we set args->nr_pages to the number of pages (not
> > folios) being requested. In mpage_read_folio() we currently set it to
> > 1. So this is going to read too far ahead for readahead if using large
> > folios.
> >
> > I think we need to make nr_pages continue to mean nr_pages. Or we pass
> > in nr_bytes or nr_blocks.
> >
> I had been pondering this, too, while developing the patch.
> The idea I had here was to change counting by pages over to counting by
> folios, as then the logic is essentially unchanged.
>
> Not a big fan of 'nr_pages', as then the question really is how much
> data we should read at the end of the day. So I'd rather go with
> 'nr_blocks' to avoid any confusion.

I think the easier answer is to adjust nr_pages to the min-order
requirements and to fix the last_block computation so we don't lie for
large folios, as follows. While at it, I noticed a folio_zero_segment()
call was missing folio_size().

diff --git a/fs/mpage.c b/fs/mpage.c
index c17d7a724e4b..624bf30f0b2e 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -152,6 +152,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 {
         struct folio *folio = args->folio;
         struct inode *inode = folio->mapping->host;
+        const unsigned min_nrpages = mapping_min_folio_nrpages(folio->mapping);
         const unsigned blkbits = inode->i_blkbits;
         const unsigned blocks_per_folio = folio_size(folio) >> blkbits;
         const unsigned blocksize = 1 << blkbits;
@@ -172,6 +173,8 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 
         /* MAX_BUF_PER_PAGE, for example */
         VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
+        VM_BUG_ON_FOLIO(args->nr_pages < min_nrpages, folio);
+        VM_BUG_ON_FOLIO(!IS_ALIGNED(args->nr_pages, min_nrpages), folio);
 
         if (args->is_readahead) {
                 opf |= REQ_RAHEAD;
@@ -182,7 +185,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
                 goto confused;
 
         block_in_file = folio_pos(folio) >> blkbits;
-        last_block = block_in_file + args->nr_pages * blocks_per_folio;
+        last_block = block_in_file + ((args->nr_pages * PAGE_SIZE) >> blkbits);
         last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
         if (last_block > last_block_in_file)
                 last_block = last_block_in_file;
@@ -269,7 +272,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
         }
 
         if (first_hole != blocks_per_folio) {
-                folio_zero_segment(folio, first_hole << blkbits, PAGE_SIZE);
+                folio_zero_segment(folio, first_hole << blkbits, folio_size(folio));
                 if (first_hole == 0) {
                         folio_mark_uptodate(folio);
                         folio_unlock(folio);
@@ -385,7 +388,7 @@ int mpage_read_folio(struct folio *folio, get_block_t get_block)
 {
         struct mpage_readpage_args args = {
                 .folio = folio,
-                .nr_pages = 1,
+                .nr_pages = mapping_min_folio_nrpages(folio->mapping),
                 .get_block = get_block,
         };
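
To make the over-read concrete, here is a quick throwaway userspace
sketch (not part of the patch; the 4 KiB page and block size, the 64 KiB
minimum folio size and the 32-page readahead window are made-up numbers)
comparing the old and the new last_block arithmetic:

#include <stdio.h>

int main(void)
{
        const unsigned long page_size = 4096;          /* assumed PAGE_SIZE */
        const unsigned blkbits = 12;                   /* assumed 4 KiB blocks */
        const unsigned long folio_size = 64 * 1024;    /* assumed 64 KiB min folio */
        const unsigned long blocks_per_folio = folio_size >> blkbits;  /* 16 */
        const unsigned long nr_pages = 32;             /* a 128 KiB readahead window */

        /* old formula: scales a *page* count by blocks per *folio* */
        unsigned long old_blocks = nr_pages * blocks_per_folio;        /* 512 */

        /* new formula: pages -> bytes -> blocks, independent of folio size */
        unsigned long new_blocks = (nr_pages * page_size) >> blkbits;  /* 32 */

        printf("old: %lu blocks, new: %lu blocks\n", old_blocks, new_blocks);
        return 0;
}

With these numbers the old computation asks for 16x the data that
readahead actually requested, while converting nr_pages through PAGE_SIZE
keeps last_block honest no matter how large the folio is. And since
mpage_read_folio() now seeds nr_pages with mapping_min_folio_nrpages(),
nr_pages keeps meaning a page count there as well.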