On Fri, Dec 13, 2024 at 2:23 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> This patchset adds support for folios larger than one page size in FUSE.
>
> This patchset is rebased on top of the (unmerged) patchset that removes temp
> folios in writeback [1]. This patchset was tested by running it through fstests
> on passthrough_hp.
>
> Please note that writes are still effectively one page size. Larger writes can
> be enabled by setting the order on the fgp flag passed in to __filemap_get_folio()
> but benchmarks show this significantly degrades performance. More investigation
> needs to be done into this. As such, buffered writes will be optimized in a
> future patchset.
>
> Benchmarks show roughly a ~45% improvement in read throughput.
>
> Benchmark setup:
>
> -- Set up server --
> ./libfuse/build/example/passthrough_hp --bypass-rw=1 ~/libfuse
> ~/mounts/fuse/ --nopassthrough
> (using libfuse patched with https://github.com/libfuse/libfuse/pull/807)
>
> -- Run fio --
> fio --name=read --ioengine=sync --rw=read --bs=1M --size=1G
> --numjobs=2 --ramp_time=30 --group_reporting=1
> --directory=mounts/fuse/
>
> Machine 1:
> No large folios: ~4400 MiB/s
> Large folios: ~7100 MiB/s
>
> Machine 2:
> No large folios: ~3700 MiB/s
> Large folios: ~6400 MiB/s
>
>
> [1] https://lore.kernel.org/linux-fsdevel/20241122232359.429647-1-joannelkoong@xxxxxxxxx/
>

A couple of updates on this:

* I'm going to remove the writeback patch (patch 11/12) in this series
  and resubmit, and leave large folios writeback to be done as a
  separate future patchset. Getting writeback to work with large folios
  has a dependency on [1], which unfortunately does not look like it'll
  be resolved anytime soon. If we cannot remove tmp pages, then we'll
  likely need to use a different data structure than the rb tree to
  account for large folios w/ tmp pages.

  I believe we can still enable large folios overall even without large
  folios writeback, as even with the inode->i_mapping set to a large
  folio order range, writeback will still only operate on 4k folios
  until fgf_set_order() is explicitly set in fuse_write_begin() for the
  __filemap_get_folio() call.

* There's a discussion here [2] about perf degradation for writeback
  writes on large folios due to writeback throttling when balancing
  dirty pages. This is due to fuse enabling bdi strictlimit. More
  experimentation will be needed to figure out what a good folio order
  is, and whether it's possible to do something like remove the
  strictlimit for privileged servers.

* Writeback on FUSE will need support for more granular dirty tracking,
  so that we don't have to write back the entire large folio if only a
  few pages in it are dirtied. I'm planning to take a look at iomap and
  netfs and see if maybe FUSE can hook into that for it.

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/20241122232359.429647-1-joannelkoong@xxxxxxxxx/
[2] https://lore.kernel.org/linux-fsdevel/CAJnrk1a38pv3OgFZRfdTiDMXuPWuBgN8KY47XfOsYHj=N2wxAg@xxxxxxxxxxxxxx/

> Changelog:
> v2: https://lore.kernel.org/linux-fsdevel/20241125220537.3663725-1-joannelkoong@xxxxxxxxx/
> v2 -> v3:
> * Fix direct io parsing to check each extracted page instead of assuming all
>   pages in a large folio will be used (Matthew)
>
> v1: https://lore.kernel.org/linux-fsdevel/20241109001258.2216604-1-joannelkoong@xxxxxxxxx/
> v1 -> v2:
> * Change naming from "non-writeback write" to "writethrough write"
> * Fix deadlock for writethrough writes by calling fault_in_iov_iter_readable()
>   first before __filemap_get_folio() (Josef)
> * For readahead, retain original folio_size() for descs.length (Josef)
> * Use folio_zero_range() api in fuse_copy_folio() (Josef)
> * Add Josef's reviewed-bys
>
> Joanne Koong (12):
>   fuse: support copying large folios
>   fuse: support large folios for retrieves
>   fuse: refactor fuse_fill_write_pages()
>   fuse: support large folios for writethrough writes
>   fuse: support large folios for folio reads
>   fuse: support large folios for symlinks
>   fuse: support large folios for stores
>   fuse: support large folios for queued writes
>   fuse: support large folios for readahead
>   fuse: optimize direct io large folios processing
>   fuse: support large folios for writeback
>   fuse: enable large folios
>
>  fs/fuse/dev.c  | 128 ++++++++++++++++++++++---------------------
>  fs/fuse/dir.c  |   8 +--
>  fs/fuse/file.c | 144 +++++++++++++++++++++++++++++++++----------------
>  3 files changed, 166 insertions(+), 114 deletions(-)
>
> --
> 2.43.5
>
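
To make the granular dirty tracking point above a bit more concrete, here
is a rough userspace sketch in C of per-page dirty bits inside a single
large folio. It is only an illustration of the idea, not code from this
series or from iomap/netfs: struct folio_dirty_map, the dirty_map_*()
helpers and the SKETCH_* constants are all hypothetical names, and the
4 KiB granularity and 64-page limit are assumptions chosen to keep the
example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed tracking granularity */
#define SKETCH_MAX_PAGES 64u   /* one 64-bit word covers up to a 256 KiB folio */

/* Hypothetical per-folio state: one dirty bit per 4 KiB page. */
struct folio_dirty_map {
	unsigned int nr_pages;
	uint64_t dirty;
};

static void dirty_map_init(struct folio_dirty_map *m, unsigned int nr_pages)
{
	assert(nr_pages >= 1 && nr_pages <= SKETCH_MAX_PAGES);
	m->nr_pages = nr_pages;
	m->dirty = 0;
}

/* Mark every page touched by the byte range [off, off + len) as dirty. */
static void dirty_map_mark(struct folio_dirty_map *m, size_t off, size_t len)
{
	size_t first, last, i;

	if (len == 0)
		return;
	assert(off + len <= (size_t)m->nr_pages * SKETCH_PAGE_SIZE);

	first = off / SKETCH_PAGE_SIZE;
	last = (off + len - 1) / SKETCH_PAGE_SIZE;
	for (i = first; i <= last; i++)
		m->dirty |= (uint64_t)1 << i;
}

/*
 * Find the next contiguous run of dirty pages at or after page 'start'.
 * Returns false when no dirty page remains.
 */
static bool dirty_map_next_range(const struct folio_dirty_map *m,
				 unsigned int start,
				 unsigned int *first, unsigned int *count)
{
	unsigned int i = start;

	while (i < m->nr_pages && !((m->dirty >> i) & 1))
		i++;
	if (i >= m->nr_pages)
		return false;

	*first = i;
	while (i < m->nr_pages && ((m->dirty >> i) & 1))
		i++;
	*count = i - *first;
	return true;
}

/* Bytes that would be written back if only the dirty runs were sent. */
static size_t dirty_map_bytes(const struct folio_dirty_map *m)
{
	unsigned int first, count, pos = 0;
	size_t bytes = 0;

	while (dirty_map_next_range(m, pos, &first, &count)) {
		bytes += (size_t)count * SKETCH_PAGE_SIZE;
		pos = first + count;
	}
	return bytes;
}
```

With this layout, a 64 KiB folio (16 pages) that has 100 bytes dirtied at
offset 5000 would report a single dirty run of one page, so a writeback
pass built on dirty_map_next_range() would send 4096 bytes rather than
the full 65536. A real implementation would also need locking and would
have to tie into the page cache's own dirty accounting, which this
sketch leaves out.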