Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree

On Tue, Oct 15, 2024 at 3:01 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Mon, 14 Oct 2024 at 20:23, Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> > This change sets AS_NO_WRITEBACK_RECLAIM on the inode mapping so that
> > FUSE folios are not reclaimed and waited on while in writeback, and
> > removes the temporary folio + extra copying and the internal rb tree.
>
> What about sync(2)?   And page migration?
>
> Hopefully there are no other cases, but I think a careful review of
> places where generic code waits for writeback is needed before we can
> say for sure.

The places where I see this potential deadlock still being possible are:
* page migration when handling a page fault:
     In particular, this path: handle_mm_fault() ->
__handle_mm_fault() -> handle_pte_fault() -> do_numa_page() ->
migrate_misplaced_folio() -> migrate_pages() -> migrate_pages_sync()
-> migrate_pages_batch() -> migrate_folio_unmap() ->
folio_wait_writeback()
* syscalls that trigger waits on writeback, which will lead to
deadlock if a single-threaded fuse server calls any of them while
servicing requests:
    - sync(), sync_file_range(), fsync(), fdatasync()
    - swapoff()
    - move_pages()

I need to analyze the page fault path more to get a clearer picture of
what is happening, but so far this looks like a valid case for a
correctly written fuse server to run into.
For the syscalls, however, is it valid/safe in general (disregarding
the writeback deadlock scenario for a minute) for fuse servers to
invoke these syscalls in their handlers anyway?

The other places where I see a generic wait on writeback seem safe:
* splice, page_cache_pipe_buf_try_steal() (fs/splice.c):
   We hit this in fuse when we try to move a page from the pipe buffer
into the page cache (fuse_try_move_page()) for the SPLICE_F_MOVE case.
This wait seems fine, since the folio that's being waited on is the
folio in the pipe buffer which is not a fuse folio.
* memory failure (mm/memory-failure.c):
   Soft offlining a page and handling page memory failure - these can
be triggered asynchronously (memory_failure_work_func()), but this
should be fine for the fuse use case since the server isn't blocked on
servicing any writeback requests while memory failure handling is
waiting on writeback.
* page truncation (mm/truncate.c):
   Same here. These cases seem fine since the server isn't blocked on
servicing writeback requests while truncation waits on writeback.


Thanks,
Joanne

>
> Thanks,
> Miklos




