On Fri, Nov 8, 2024 at 4:22 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> On Fri, Nov 8, 2024 at 4:13 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
> >
> > This patchset adds support for folios larger than one page size in FUSE.
> >
> > This patchset is rebased on top of the (unmerged) patchset that removes temp
> > folios in writeback [1]. (There is also a version of this patchset that is
> > independent from that change, but that version has two additional patches
> > needed to account for temp folios and temp folio copying, which may require
> > some debate to get the API right for as these two patches add generic
> > (non-FUSE) helpers. For simplicity's sake for now, I sent out this patchset
> > version rebased on top of the patchset that removes temp pages)
> >
> > This patchset was tested by running it through fstests on passthrough_hp.
>
> Will be updating this thread with some fio benchmark results early next week.
>

For reads I'm seeing about a ~45% increase in throughput.

This is the setup I used:

-- Set up server --
./libfuse/build/example/passthrough_hp --bypass-rw=1 ~/libfuse ~/mounts/fuse/ --nopassthrough
(using libfuse patched with Bernd's passthrough_hp benchmark pr
https://github.com/libfuse/libfuse/pull/807)

-- Run fio --
fio --name=read --ioengine=sync --rw=read --bs=1M --size=1G \
    --numjobs=2 --ramp_time=30 --group_reporting=1 \
    --directory=mounts/fuse/

I tested on 2 machines and saw the following:

Machine 1:
    No large folios:  ~4400 MiB/s
    Large folios:     ~7100 MiB/s

Machine 2:
    No large folios:  ~3700 MiB/s
    Large folios:     ~6400 MiB/s

I also did see variability (on both ends) between runs and threw away outliers.

For writes, we're still sending out one page folios (see thread on the 4th
patch in this series [1]), so there is no difference. Benchmarks showed that
trying to get the largest folios possible from __filemap_get_folio() is an
over-optimization and ends up being significantly more expensive.
I think it'd probably be an improvement if we set some reasonably sized order
to the __filemap_get_folio() call (order 2?), but that can be optimized in the
future in another patchset.

[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1aPVwNmv2uxYLwtdwGqe=QUROUXmZc8BiLAV=uqrnCrrw@xxxxxxxxxxxxxx/

> >
> > [1] https://lore.kernel.org/linux-fsdevel/20241107235614.3637221-1-joannelkoong@xxxxxxxxx/
> >
> > Joanne Koong (12):
> >   fuse: support copying large folios
> >   fuse: support large folios for retrieves
> >   fuse: refactor fuse_fill_write_pages()
> >   fuse: support large folios for non-writeback writes
> >   fuse: support large folios for folio reads
> >   fuse: support large folios for symlinks
> >   fuse: support large folios for stores
> >   fuse: support large folios for queued writes
> >   fuse: support large folios for readahead
> >   fuse: support large folios for direct io
> >   fuse: support large folios for writeback
> >   fuse: enable large folios
> >
> >  fs/fuse/dev.c  | 131 +++++++++++++++++++++++-----------------------
> >  fs/fuse/dir.c  |   8 +--
> >  fs/fuse/file.c | 138 +++++++++++++++++++++++++++++++------------------
> >  3 files changed, 159 insertions(+), 118 deletions(-)
> >
> > --
> > 2.43.5
> >
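
For anyone comparing their own fio runs against the read numbers reported in this message, here is a minimal sketch of the relative-gain arithmetic, using only the rounded per-machine figures quoted above. The helper name percent_increase is hypothetical and only for illustration. Note that the rounded figures work out higher than the ~45% headline; outliers were discarded and runs varied, so treat either number as approximate.

```python
def percent_increase(baseline_mib_s: float, new_mib_s: float) -> float:
    """Relative throughput gain of new over baseline, in percent."""
    if baseline_mib_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return (new_mib_s - baseline_mib_s) / baseline_mib_s * 100.0


# Rounded read throughputs in MiB/s, as quoted in this message:
# (without large folios, with large folios)
results = {
    "Machine 1": (4400.0, 7100.0),
    "Machine 2": (3700.0, 6400.0),
}

for machine, (without_large, with_large) in results.items():
    gain = percent_increase(without_large, with_large)
    print(f"{machine}: {gain:.1f}% increase")
```

Swapping in the bandwidth values from your own fio output (with and without the series applied) gives a like-for-like comparison, provided the fio job parameters match the ones listed in the setup.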