On Fri, Dec 6, 2024 at 2:25 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Dec 06, 2024 at 09:41:25AM -0800, Joanne Koong wrote:
> > On Fri, Dec 6, 2024 at 1:50 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> > > -     folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
> > > +     folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN |
> > > fgf_set_order(len),
> > >
> > > Otherwise the large folio is not enabled on the buffer write path.
> > >
> > >
> > > Besides, when applying the above diff, the large folio is indeed enabled
> > > but it suffers severe performance regression:
> > >
> > > fio 1 job buffer write:
> > > 2GB/s BW w/o large folio, and 200MB/s BW w/ large folio
> >
> > This is the behavior I noticed as well when running some benchmarks on
> > v1 [1]. I think it's because when we call into __filemap_get_folio(),
> > we hit the FGP_CREAT path and if the order we set is too high, the
> > internal call to filemap_alloc_folio() will repeatedly fail until it
> > finds an order it's able to allocate (eg the
> > do { ... } while (order-- > min_order) loop).
>
> But this is very different from what other filesystems have measured
> when allocating large folios during writes.  eg:
>
> https://lore.kernel.org/linux-fsdevel/20240527163616.1135968-1-hch@xxxxxx/

Ok, this seems like something particular to FUSE then, if all the
other filesystems are seeing 2x throughput improvements for buffered
writes. If someone doesn't get to this before me, I'll look deeper
into this.

Thanks,
Joanne

>
> So we need to understand what's different about fuse.  My suspicion is
> that it's disabling some other optimisation that is only done on
> order 0 folios, but that's just wild speculation.  Needs someone to
> dig into it and look at profiles to see what's really going on.
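
For anyone following along, the order-fallback behaviour described in the
quoted text (the do { ... } while (order-- > min_order) loop) can be sketched
in userspace C. This is only an illustration under stated assumptions:
try_alloc() and alloc_with_fallback() are hypothetical names, not kernel
functions, and the stand-in allocator simply succeeds at or below a fixed
order. It does not model the kernel's gfp handling or the page cache insert,
and whether this loop actually explains the FUSE regression is unconfirmed
in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an allocator call such as
 * filemap_alloc_folio(): it succeeds only when the requested order does
 * not exceed max_available_order. Real allocation success depends on
 * memory fragmentation and gfp flags, which are not modelled here.
 */
static bool try_alloc(unsigned int order, unsigned int max_available_order)
{
	return order <= max_available_order;
}

/*
 * Try the requested order first, then step down one order at a time
 * until an allocation succeeds or min_order has been tried.
 * Returns the order that succeeded, or -1 if every attempt failed.
 * *attempts receives the number of allocation attempts made.
 */
static int alloc_with_fallback(unsigned int order, unsigned int min_order,
			       unsigned int max_available_order,
			       unsigned int *attempts)
{
	assert(attempts != NULL);
	*attempts = 0;

	do {
		(*attempts)++;
		if (try_alloc(order, max_available_order))
			return (int)order;
		/* Failed at this order: fall through and retry one lower. */
	} while (order-- > min_order);

	return -1;
}
```

In this sketch, requesting order 9 with min_order 0 while the stand-in only
succeeds at order 4 or below takes six attempts (orders 9 down to 4) before
returning 4. Each failed attempt is cheap here; in a real allocator the cost
per failed attempt is what would need to be measured with profiles.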