On Tue, Oct 15, 2024 at 3:01 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote: > > On Mon, 14 Oct 2024 at 20:23, Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > > This change sets AS_NO_WRITEBACK_RECLAIM on the inode mapping so that > > FUSE folios are not reclaimed and waited on while in writeback, and > > removes the temporary folio + extra copying and the internal rb tree. > > What about sync(2)? And page migration? > > Hopefully there are no other cases, but I think a careful review of > places where generic code waits for writeback is needed before we can > say for sure. The places where I see this potential deadlock still being possible are: * page migration when handling a page fault: In particular, this path: handle_mm_fault() -> __handle_mm_fault() -> handle_pte_fault() -> do_numa_page() -> migrate_misplaced_folio() -> migrate_pages() -> migrate_pages_sync() -> migrate_pages_batch() -> migrate_folio_unmap() -> folio_wait_writeback() * syscalls that trigger waits on writeback, which will lead to deadlock if a single-threaded fuse server calls this when servicing requests: - sync(), sync_file_range(), fsync(), fdatasync() - swapoff() - move_pages() I need to analyze the page fault path more to get a clearer picture of what is happening, but so far this looks like a valid case for a correctly written fuse server to run into. For the syscalls however, is it valid/safe in general (disregarding the writeback deadlock scenario for a minute) for fuse servers to be invoking these syscalls in their handlers anyways? The other places where I see a generic wait on writeback seem safe: * splice, page_cache_pipe_buf_try_steal() (fs/splice.c): We hit this in fuse when we try to move a page from the pipe buffer into the page cache (fuse_try_move_page()) for the SPLICE_F_MOVE case. This wait seems fine, since the folio that's being waited on is the folio in the pipe buffer which is not a fuse folio. * memory failure (mm/memory_failure.c): Soft offlining a page and handling page memory failure - these can be triggered asynchronously (memory_failure_work_func()), but this should be fine for the fuse use case since the server isn't blocked on servicing any writeback requests while memory failure handling is waiting on writeback * page truncation (mm/truncate.c): Same here. These cases seem fine since the server isn't blocked on servicing writeback requests while truncation waits on writeback Thanks, Joanne > > Thanks, > Miklos