Re: [PATCH v4 6/6] fuse: remove tmp folio for writebacks and internal rb tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 8, 2024 at 12:49 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi, Joanne,
>
> Thanks for the continuing work!
>
> On 11/8/24 7:56 AM, Joanne Koong wrote:
> > Currently, we allocate and copy data to a temporary folio when
> > handling writeback in order to mitigate the following deadlock scenario
> > that may arise if reclaim waits on writeback to complete:
> > * single-threaded FUSE server is in the middle of handling a request
> >   that needs a memory allocation
> > * memory allocation triggers direct reclaim
> > * direct reclaim waits on a folio under writeback
> > * the FUSE server can't write back the folio since it's stuck in
> >   direct reclaim
> >
> > To work around this, we allocate a temporary folio and copy over the
> > original folio to the temporary folio so that writeback can be
> > immediately cleared on the original folio. This additionally requires us
> > to maintain an internal rb tree to keep track of writeback state on the
> > temporary folios.
> >
> > A recent change prevents reclaim logic from waiting on writeback for
> > folios whose mappings have the AS_WRITEBACK_MAY_BLOCK flag set in it.
> > This commit sets AS_WRITEBACK_MAY_BLOCK on FUSE inode mappings (which
> > will prevent FUSE folios from running into the reclaim deadlock described
> > above) and removes the temporary folio + extra copying and the internal
> > rb tree.
> >
> > fio benchmarks --
> > (using averages observed from 10 runs, throwing away outliers)
> >
> > Setup:
> > sudo mount -t tmpfs -o size=30G tmpfs ~/tmp_mount
> >  ./libfuse/build/example/passthrough_ll -o writeback -o max_threads=4 -o source=~/tmp_mount ~/fuse_mount
> >
> > fio --name=writeback --ioengine=sync --rw=write --bs={1k,4k,1M} --size=2G
> > --numjobs=2 --ramp_time=30 --group_reporting=1 --directory=/root/fuse_mount
> >
> >         bs =  1k          4k            1M
> > Before  351 MiB/s     1818 MiB/s     1851 MiB/s
> > After   341 MiB/s     2246 MiB/s     2685 MiB/s
> > % diff        -3%          23%         45%
> >
> > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
>
>
> > @@ -1622,7 +1543,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
> >                       return res;
> >               }
> >       }
> > -     if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
> > +     if (!cuse && filemap_range_has_writeback(mapping, pos, (pos + count - 1))) {
>
> filemap_range_has_writeback() is not equivalent to
> fuse_range_is_writeback(), as it will return true as long as there's any
> locked or dirty page?  I can't find an equivalent helper function at
> hand though.
>

Hi Jingbo,

I couldn't find an equivalent helper function either. My guess is that
filemap_range_has_writeback() returns true if the page is locked
because it doesn't have a way of determining if the page is dirty or
not (it seems like if a page is locked, then we can't read the
writeback bit on it) so it errs on the side of assuming yes.
For this case, it seems okay to me to use
filemap_range_has_writeback() because if we get back a false positive
(eg filemap_range_has_writeback() returns true when it's actually
false), the only cost is the overhead of an additional
fuse_sync_writes() call but fuse_sync_writes() will return immediately
from the wait().

>
>
> > @@ -3423,7 +3143,6 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> >       fi->iocachectr = 0;
> >       init_waitqueue_head(&fi->page_waitq);
> >       init_waitqueue_head(&fi->direct_io_waitq);
> > -     fi->writepages = RB_ROOT;
>
> It seems that 'struct rb_root writepages' is not removed from fuse_inode
> structure.
>

Nice catch! I'll remove this from the fuse_inode struct in v5.

>
> Besides, I also looked through the former 5 patches and can't find any
> obvious errors at the very first glance.  Hopefully the MM guys could
> offer more professional reviews.
>

Thanks for looking through this code in this version and the past
versions of this patchset too. It's much appreciated!

> --
> Thanks,
> Jingbo





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux