On Mon, Dec 9, 2024 at 9:07 AM Malte Schröder <malte.schroeder@xxxxxxxx> wrote: > > On 09/12/2024 16:48, Josef Bacik wrote: > > On Mon, Dec 09, 2024 at 03:28:14PM +0000, Matthew Wilcox wrote: > >> On Mon, Dec 09, 2024 at 09:49:48AM -0500, Josef Bacik wrote: > >>>> Ha! This time I bisected from f03b296e8b51 to d1dfb5f52ffc. I ended up > >>>> with 3b97c3652d91 as the culprit. > >>> Willy, I've looked at this code and it does indeed look like a 1:1 conversion, > >>> EXCEPT I'm fuzzy about how how this works with large folios. Previously, if we > >>> got a hugepage in, we'd get each individual struct page back for the whole range > >>> of the hugepage, so if for example we had a 2M hugepage, we'd fill in the > >>> ->offset for each "middle" struct page as 0, since obviously we're consuming > >>> PAGE_SIZE chunks at a time. > >>> > >>> But now we're doing this > >>> > >>> for (i = 0; i < nfolios; i++) > >>> ap->folios[i + ap->num_folios] = page_folio(pages[i]); > >>> > >>> So if userspace handed us a 2M hugepage, page_folio() on each of the > >>> intermediary struct page's would return the same folio, correct? So we'd end up > >>> with the wrong offsets for our fuse request, because they should be based from > >>> the start of the folio, correct? > >> I think you're 100% right. We could put in some nice asserts to check > >> this is what's happening, but it does seem like a rather incautious > >> conversion. Yes, all folios _in the page cache_ for fuse are small, but > >> that's not guaranteed to be the case for folios found in userspace for > >> directio. At least the comment is wrong, and I'd suggest the code is too. > > Ok cool, Malte can you try the attached only compile tested patch and see if the > > problem goes away? Thanks, > > > > Josef > > > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > > index 88d0946b5bc9..c4b93ead99a5 100644 > > --- a/fs/fuse/file.c > > +++ b/fs/fuse/file.c > > @@ -1562,9 +1562,19 @@ static int fuse_get_user_pages(struct fuse_args_pages *ap, struct iov_iter *ii, > > nfolios = DIV_ROUND_UP(ret, PAGE_SIZE); > > > > ap->descs[ap->num_folios].offset = start; > > - fuse_folio_descs_length_init(ap->descs, ap->num_folios, nfolios); > > - for (i = 0; i < nfolios; i++) > > - ap->folios[i + ap->num_folios] = page_folio(pages[i]); > > + for (i = 0; i < nfolios; i++) { > > + struct folio *folio = page_folio(pages[i]); > > + unsigned int offset = start + > > + (folio_page_idx(folio, pages[i]) << PAGE_SHIFT); > > + unsigned int len = min_t(unsigned int, ret, folio_size(folio) - offset); > > + > > + len = min_t(unsigned int, len, PAGE_SIZE); > > + > > + ap->descs[ap->num_folios + i].offset = offset; > > + ap->descs[ap->num_folios + i].length = len; > > + ap->folios[i + ap->num_folios] = folio; > > + start = 0; > > + } > > > > ap->num_folios += nfolios; > > ap->descs[ap->num_folios - 1].length -= > > The problem persists with this patch. > Catching up on this thread now. I'll investigate this today. > > /Malte >