On Tue, Jan 14, 2025 at 10:58 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > On Tue, 14 Jan 2025 at 19:08, Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > - my understanding is that the majority of use cases do use splice (eg > > iirc, libfuse does as well), in which case there's no point to this > > patchset then > > If it turns out that non-splice writes are more performant, then > libfuse can be fixed to use non-splice by default. It's not as clear > cut though, since write through (which is also the default in libfuse, > AFAIK) should not be affected by all this, since that never used tmp > pages. My thinking was that spliced writes without tmp pages would be fastest, then non-splice writes w/out tmp pages and spliced writes w/ would be roughly the same. But i'd need to benchmark and verify this assumption. > > > - codewise, imo this gets messy (eg we would still need the rb tree > > and would now need to check writeback against folio writeback state > > and against the rb tree) > > I'm thinking of something slightly different: remove the current tmp > page mess, but instead of duplicating a page ref on splice, fall back > to copying the cache page (see the user_pages case in > fuse_copy_page()). This should have very similar performance to what > we have today, but allows us to deal with page accesses the same way > for both regular and splice I/O on /dev/fuse. If we copy the cache page, do we not have the same issue with needing an rb tree to track writeback state since writeback on the original folio would be immediately cleared? Thanks, Joanne > > Thanks, > Miklos