Currently it calls pagevec_lookup_range_nr_tag(), but Willy pointed out
that that is probably inefficient, as we might end up having to search
several times if we get down to looking for one more page to fill a
write.

"I think ceph is misusing pagevec_lookup_range_nr_tag(). Let's suppose
you get a range which is AAAAbbbbAAAAbbbbAAAAbbbbbbbb(...)bbbbAAAA and
you try to fetch max_pages=13. First loop will get AAAAbbbbAAAAb and
have 8 locked_pages. The next call will get bbbAA and now
locked_pages=10. Next call gets AAb ... and now you're iterating your
way through all the 'b' one page at a time until you find that first A."

'A' here refers to pages that are eligible for writeback and 'b'
represents ones that aren't (for whatever reason).

Ceph is also the only caller of pagevec_lookup_range_nr_tag(), so
changing this code to use pagevec_lookup_range_tag() should allow us to
eliminate that call as well. That may mean that we sometimes find more
pages than are needed, but the extra references will just get put at the
end regardless.

Reported-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
 fs/ceph/addr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

I'm still testing this, but it looks good so far. If it's OK, we'll get
this in for v5.10, and then I'll send a patch to remove
pagevec_lookup_range_nr_tag.

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 6ea761c84494..b03dbaa9d345 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -962,9 +962,8 @@ static int ceph_writepages_start(struct address_space *mapping,
 		max_pages = wsize >> PAGE_SHIFT;

 get_more_pages:
-		pvec_pages = pagevec_lookup_range_nr_tag(&pvec, mapping, &index,
-						end, PAGECACHE_TAG_DIRTY,
-						max_pages - locked_pages);
+		pvec_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
+						end, PAGECACHE_TAG_DIRTY);
 		dout("pagevec_lookup_range_tag got %d\n", pvec_pages);
 		if (!pvec_pages && !locked_pages)
 			break;
-- 
2.26.2
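
For illustration only, here is a rough userspace sketch of the effect Willy
describes. It is not kernel code and does not call the real pagevec API:
count_lookups(), BATCH and example_range are hypothetical names, the batch
size of 15 is an assumption standing in for PAGEVEC_SIZE, and the elided
"(...)" stretch of the range is replaced by an arbitrary run of 40 'b'
pages. It only counts how many lookups one write needs when each lookup is
capped at the number of pages still missing, versus always asking for a
full batch.

```c
#include <string.h>

#define BATCH 15 /* assumed lookup batch size, standing in for PAGEVEC_SIZE */

/*
 * Hypothetical range modelled on the example in the changelog:
 * 'A' = eligible for writeback, 'b' = not eligible.
 * The 40-page run of 'b' is an arbitrary stand-in for the elided part.
 */
static const char example_range[] =
	"AAAAbbbbAAAAbbbbAAAA"
	"bbbbbbbbbb" "bbbbbbbbbb" "bbbbbbbbbb" "bbbbbbbbbb"
	"AAAA";

/*
 * Count the lookups needed to collect max_pages 'A' pages from range.
 * If limit_to_need is nonzero, each lookup asks only for the pages still
 * missing (the _nr_ style); otherwise it always asks for a full batch.
 */
static int count_lookups(const char *range, int max_pages, int limit_to_need)
{
	size_t len = strlen(range);
	size_t index = 0;
	int locked = 0;
	int calls = 0;

	while (locked < max_pages && index < len) {
		size_t want = BATCH;
		size_t got, i;

		if (limit_to_need && (size_t)(max_pages - locked) < want)
			want = (size_t)(max_pages - locked);

		/* A lookup returns at most 'want' pages, fewer at end of range. */
		got = len - index < want ? len - index : want;
		calls++;

		/* Take eligible pages until the write is full. */
		for (i = 0; i < got && locked < max_pages; i++)
			if (range[index + i] == 'A')
				locked++;

		/* In this model, unused pages in the batch are simply dropped. */
		index += got;
	}
	return calls;
}
```

With max_pages=13 on this made-up range, count_lookups(example_range, 13, 1)
comes out at 43 lookups, most of them fetching a single 'b' page, while
count_lookups(example_range, 13, 0) needs 5.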