On Fri, Sep 27, 2024 at 1:45 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> Currently we're using the __readahead_batch() helper which populates our
> fuse_args_pages->pages array with pages.  Convert this to use the newer
> folio-based pattern, which is to call readahead_folio() to get the next
> folio in the readahead batch.  I've updated the code to use things like
> folio_size() and to take into account larger folio sizes, but this is
> purely to make that eventual work easier to do; we currently will not
> get large folios, so this is more future-proofing than actual support.
>
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> ---
>  fs/fuse/file.c | 43 ++++++++++++++++++++++++++++---------------
>  1 file changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index f33fbce86ae0..132528cde745 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -938,7 +938,6 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
>                 struct folio *folio = page_folio(ap->pages[i]);
>
>                 folio_end_read(folio, !err);
> -               folio_put(folio);
>         }
>         if (ia->ff)
>                 fuse_file_put(ia->ff, false);
> @@ -985,18 +984,36 @@ static void fuse_send_readpages(struct fuse_io_args *ia, struct file *file)
>  static void fuse_readahead(struct readahead_control *rac)
>  {
>         struct inode *inode = rac->mapping->host;
> +       struct fuse_inode *fi = get_fuse_inode(inode);
>         struct fuse_conn *fc = get_fuse_conn(inode);
> -       unsigned int i, max_pages, nr_pages = 0;
> +       unsigned int max_pages, nr_pages;
> +       pgoff_t first = readahead_index(rac);
> +       pgoff_t last = first + readahead_count(rac) - 1;
>
>         if (fuse_is_bad(inode))
>                 return;
>
> +       wait_event(fi->page_waitq, !fuse_range_is_writeback(inode, first, last));

Should this line be moved to after we check the readahead count? e.g.

        nr_pages = readahead_count(rac);
        if (!nr_pages)
                return;

        wait_event(fi->page_waitq,
                   !fuse_range_is_writeback(inode, first, last));

Otherwise I think in that case you mentioned, where read_pages() calls
into readahead_folio() after it's consumed the last folio, we end up
calling this wait_event unnecessarily.

> +
>         max_pages = min_t(unsigned int, fc->max_pages,
>                           fc->max_read / PAGE_SIZE);
>
> -       for (;;) {
> +       /*
> +        * This is only accurate the first time through, since readahead_folio()
> +        * doesn't update readahead_count() from the previous folio until the
> +        * next call.  Grab nr_pages here so we know how many pages we're going
> +        * to have to process.  This means that we will exit here with
> +        * readahead_count() == folio_nr_pages(last_folio), but we will have
> +        * consumed all of the folios, and read_pages() will call
> +        * readahead_folio() again which will clean up the rac.
> +        */
> +       nr_pages = readahead_count(rac);
> +
> +       while (nr_pages) {
>                 struct fuse_io_args *ia;
>                 struct fuse_args_pages *ap;
> +               struct folio *folio;
> +               unsigned cur_pages = min(max_pages, nr_pages);
>
>                 if (fc->num_background >= fc->congestion_threshold &&
>                     rac->ra->async_size >= readahead_count(rac))
> @@ -1006,23 +1023,19 @@ static void fuse_readahead(struct readahead_control *rac)
>                          */
>                         break;
>
> -               nr_pages = readahead_count(rac) - nr_pages;
> -               if (nr_pages > max_pages)
> -                       nr_pages = max_pages;
> -               if (nr_pages == 0)
> -                       break;
> -               ia = fuse_io_alloc(NULL, nr_pages);
> +               ia = fuse_io_alloc(NULL, cur_pages);
>                 if (!ia)
>                         return;
>                 ap = &ia->ap;
> -               nr_pages = __readahead_batch(rac, ap->pages, nr_pages);
> -               for (i = 0; i < nr_pages; i++) {
> -                       fuse_wait_on_page_writeback(inode,
> -                                                   readahead_index(rac) + i);
> -                       ap->descs[i].length = PAGE_SIZE;
> +
> +               while (ap->num_pages < cur_pages &&
> +                      (folio = readahead_folio(rac)) != NULL) {
> +                       ap->pages[ap->num_pages] = &folio->page;
> +                       ap->descs[ap->num_pages].length = folio_size(folio);
> +                       ap->num_pages++;
>                 }
> -               ap->num_pages = nr_pages;
>                 fuse_send_readpages(ia, rac->file);
> +               nr_pages -= cur_pages;
>         }
>  }
>
> --
> 2.43.0
>
>
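To expand on my question above, here's a rough, untested sketch of the
reordering I have in mind. Everything here (fi, first, last, the
wait_event() condition) is taken unchanged from your patch; only the
order of the checks is different:

        static void fuse_readahead(struct readahead_control *rac)
        {
                struct inode *inode = rac->mapping->host;
                struct fuse_inode *fi = get_fuse_inode(inode);
                struct fuse_conn *fc = get_fuse_conn(inode);
                unsigned int max_pages, nr_pages;
                pgoff_t first = readahead_index(rac);
                pgoff_t last = first + readahead_count(rac) - 1;

                if (fuse_is_bad(inode))
                        return;

                /*
                 * Bail out before sleeping on the waitqueue: if there are
                 * no folios left in the rac, there's nothing to read and
                 * nothing worth waiting for.
                 */
                nr_pages = readahead_count(rac);
                if (!nr_pages)
                        return;

                wait_event(fi->page_waitq,
                           !fuse_range_is_writeback(inode, first, last));

                max_pages = min_t(unsigned int, fc->max_pages,
                                  fc->max_read / PAGE_SIZE);
                ...
        }

The rest of the function (the while (nr_pages) loop) would stay exactly
as in your patch; the early return just guarantees we never evaluate
fuse_range_is_writeback() on an empty range.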