Re: [PATCH v2 00/12] fuse: support large folios

Joanne Koong <joannelkoong@xxxxxxxxx> · Mon, 9 Dec 2024 16:31:08 -0800

On Fri, Dec 6, 2024 at 2:25 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Dec 06, 2024 at 09:41:25AM -0800, Joanne Koong wrote:
> > On Fri, Dec 6, 2024 at 1:50 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> > > -       folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> > > +       folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN | fgf_set_order(len),
> > >
> > > Otherwise the large folio is not enabled on the buffer write path.
> > >
> > >
> > > Besides, when applying the above diff, the large folio is indeed enabled
> > > but it suffers severe performance regression:
> > >
> > > fio 1 job buffer write:
> > > 2GB/s BW w/o large folio, and 200MB/s BW w/ large folio
> >
> > This is the behavior I noticed as well when running some benchmarks on
> > v1 [1]. I think it's because when we call into __filemap_get_folio(),
> > we hit the FGP_CREAT path and if the order we set is too high, the
> > internal call to filemap_alloc_folio() will repeatedly fail until it
> > finds an order it's able to allocate (eg the
> > do { ... } while (order-- > min_order) loop).
>
> But this is very different from what other filesystems have measured
> when allocating large folios during writes.  eg:
>
> https://lore.kernel.org/linux-fsdevel/20240527163616.1135968-1-hch@xxxxxx/

Ok, this seems like something particular to FUSE then, if all the
other filesystems are seeing 2x throughput improvements for buffered
writes. If someone doesn't get to this before me, I'll look deeper
into this.
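
For reference, here is a minimal userspace sketch of the order fallback
described in the quoted text above. It is not kernel code. Every
sketch_* name, the 4 KiB page size, and the "allocation succeeds only at
or below max_avail" rule are assumptions made for illustration. The
order hint is modeled on a reading of what fgf_set_order(len) asks for
(roughly log2(len) minus the page shift) and should be checked against
include/linux/pagemap.h.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper: folio order hinted for a write of len bytes,
 * computed as floor(log2(len)) - page shift and clamped at 0.
 */
static unsigned int sketch_order_hint(size_t len)
{
	unsigned int shift = 0;

	while (len >>= 1)
		shift++;
	return shift > SKETCH_PAGE_SHIFT ? shift - SKETCH_PAGE_SHIFT : 0;
}

/*
 * Hypothetical stand-in for the allocator: succeeds only when the
 * requested order is at or below max_avail.
 */
static int sketch_try_alloc(unsigned int order, unsigned int max_avail)
{
	return order <= max_avail;
}

/*
 * Try the requested order first, then step down one order at a time
 * until an allocation succeeds or min_order has been tried.
 * Returns the order obtained, or -1 if every attempt failed.
 * *attempts is set to the number of allocation calls made.
 */
static int sketch_alloc_with_fallback(unsigned int order,
				      unsigned int min_order,
				      unsigned int max_avail,
				      unsigned int *attempts)
{
	assert(order >= min_order);
	*attempts = 0;

	do {
		(*attempts)++;
		if (sketch_try_alloc(order, max_avail))
			return (int)order;
	} while (order-- > min_order);

	return -1;
}
```

Under these assumptions, a 1 MiB write hints order 8. If only order-2
folios can be allocated, sketch_alloc_with_fallback(8, 0, 2, &attempts)
returns 2 after seven attempts. Whether repeated failed attempts like
these account for the FUSE regression is unconfirmed. Profiles would be
needed to tell.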

Thanks,
Joanne
>
> So we need to understand what's different about fuse.  My suspicion is
> that it's disabling some other optimisation that is only done on
> order 0 folios, but that's just wild speculation.  Needs someone to
> dig into it and look at profiles to see what's really going on.