Hi, Joanne, Thanks for the continuing work! On 11/8/24 7:56 AM, Joanne Koong wrote: > Currently, we allocate and copy data to a temporary folio when > handling writeback in order to mitigate the following deadlock scenario > that may arise if reclaim waits on writeback to complete: > * single-threaded FUSE server is in the middle of handling a request > that needs a memory allocation > * memory allocation triggers direct reclaim > * direct reclaim waits on a folio under writeback > * the FUSE server can't write back the folio since it's stuck in > direct reclaim > > To work around this, we allocate a temporary folio and copy over the > original folio to the temporary folio so that writeback can be > immediately cleared on the original folio. This additionally requires us > to maintain an internal rb tree to keep track of writeback state on the > temporary folios. > > A recent change prevents reclaim logic from waiting on writeback for > folios whose mappings have the AS_WRITEBACK_MAY_BLOCK flag set in it. > This commit sets AS_WRITEBACK_MAY_BLOCK on FUSE inode mappings (which > will prevent FUSE folios from running into the reclaim deadlock described > above) and removes the temporary folio + extra copying and the internal > rb tree. > > fio benchmarks -- > (using averages observed from 10 runs, throwing away outliers) > > Setup: > sudo mount -t tmpfs -o size=30G tmpfs ~/tmp_mount > ./libfuse/build/example/passthrough_ll -o writeback -o max_threads=4 -o source=~/tmp_mount ~/fuse_mount > > fio --name=writeback --ioengine=sync --rw=write --bs={1k,4k,1M} --size=2G > --numjobs=2 --ramp_time=30 --group_reporting=1 --directory=/root/fuse_mount > > bs = 1k 4k 1M > Before 351 MiB/s 1818 MiB/s 1851 MiB/s > After 341 MiB/s 2246 MiB/s 2685 MiB/s > % diff -3% 23% 45% > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> > @@ -1622,7 +1543,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > return res; > } > } > - if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) { > + if (!cuse && filemap_range_has_writeback(mapping, pos, (pos + count - 1))) { filemap_range_has_writeback() is not equivalent to fuse_range_is_writeback(), as it will return true as long as there's any locked or dirty page? I can't find an equivalent helper function at hand though. > @@ -3423,7 +3143,6 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags) > fi->iocachectr = 0; > init_waitqueue_head(&fi->page_waitq); > init_waitqueue_head(&fi->direct_io_waitq); > - fi->writepages = RB_ROOT; It seems that 'struct rb_root writepages' is not removed from fuse_inode structure. Besides, I also looked through the former 5 patches and can't find any obvious errors at the very first glance. Hopefully the MM guys could offer more professional reviews. -- Thanks, Jingbo