On Fri, Dec 13, 2024 at 4:34 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> The patch titled
>      Subject: mm: add AS_WRITEBACK_INDETERMINATE mapping flag
> has been added to the -mm mm-unstable branch.  Its filename is
>      mm-add-as_writeback_indeterminate-mapping-flag.patch
>
> This patch will shortly appear at
>      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-add-as_writeback_indeterminate-mapping-flag.patch
>
> This patch will later appear in the mm-unstable branch at
>     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
>
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
>

Hi Andrew,

After the discussion in [1], I think this patchset needs to be
unmerged from mm-unstable. Could you please remove this patchset from
the tree?

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/def423f4-fecd-4017-9bcb-74a5dbf0e9f5@xxxxxxxxxx/T/#m9f8343c359c72b70e3205ee606459c1f4e1646f4

> ------------------------------------------------------
> From: Joanne Koong <joannelkoong@xxxxxxxxx>
> Subject: mm: add AS_WRITEBACK_INDETERMINATE mapping flag
> Date: Fri, 22 Nov 2024 15:23:55 -0800
>
> Patch series "fuse: remove temp page copies in writeback", v6.
>
> The purpose of this patchset is to help make writeback-cache write
> performance in FUSE filesystems as fast as possible.
>
> In the current FUSE writeback design (see commit 3be5a52b30aa ("fuse:
> support writable mmap")), a temp page is allocated for every dirty page
> to be written back, the contents of the dirty page are copied over to the
> temp page, and the temp page gets handed to the server to write back.
> This is done so that writeback may be immediately cleared on the dirty
> page, and this in turn is done for two reasons:
>
>  a) in order to mitigate the following deadlock scenario that may arise
>     if reclaim waits on writeback on the dirty page to complete (more
>     details can be found in this thread [1]):
>
>     * single-threaded FUSE server is in the middle of handling a request
>       that needs a memory allocation
>     * memory allocation triggers direct reclaim
>     * direct reclaim waits on a folio under writeback
>     * the FUSE server can't write back the folio since it's stuck in
>       direct reclaim
>
>  b) in order to unblock internal (e.g. sync, page compaction) waits on
>     writeback without needing the server to complete writing back to disk,
>     which may take an indeterminate amount of time.
>
> Allocating and copying dirty pages to temp pages is the biggest
> performance bottleneck for FUSE writeback.  This patchset aims to get rid
> of the temp page altogether (which will also allow us to get rid of the
> internal FUSE rb tree that is needed to keep track of writeback status on
> the temp pages).  Benchmarks show approximately a 20% improvement in
> throughput for 4k block-size writes and a 45% improvement for 1M
> block-size writes.
>
> With the temp page removed, writeback state is now only cleared on the
> dirty page after the server has written it back to disk.  This may take an
> indeterminate amount of time.  There is also the possibility of
> malicious or well-intentioned but buggy servers where writeback may, in
> the worst case, never complete.
> This means that any folio_wait_writeback() on a dirty page belonging to a
> FUSE filesystem needs to be carefully audited.
>
> In particular, these are the cases that need to be accounted for:
> * potentially deadlocking in reclaim, as mentioned above
> * potentially stalling sync(2)
> * potentially stalling page migration / compaction
>
> This patchset adds a new mapping flag, AS_WRITEBACK_INDETERMINATE, which
> filesystems may set on their inode mappings to indicate that writeback
> operations may take an indeterminate amount of time to complete.  FUSE
> will set this flag on its mappings.  This patchset adds checks to the
> critical parts of reclaim, sync, and page migration logic where writeback
> may be waited on.
>
> Please note the following:
> * For sync(2), waiting on writeback will be skipped for FUSE, but this has no
>   effect on existing behavior. Dirty FUSE pages are already not guaranteed to
>   be written to disk by the time sync(2) returns (e.g. writeback is cleared on
>   the dirty page but the server may not have written out the temp page to disk
>   yet). If the caller wishes to ensure the data has actually been synced to
>   disk, they should use fsync(2)/fdatasync(2) instead.
> * AS_WRITEBACK_INDETERMINATE does not indicate that the folios should never be
>   waited on when in writeback. There are some cases where the wait is
>   desirable. For example, for the sync_file_range() syscall, it is fine to
>   wait on the writeback since the caller passes in a fd for the operation.
>
> [1] https://lore.kernel.org/linux-kernel/495d2400-1d96-4924-99d3-8b2952e05fc3@xxxxxxxxxxxxxxxxx/
>
>
> This patch (of 5):
>
> Add a new mapping flag AS_WRITEBACK_INDETERMINATE which filesystems may
> set to indicate that writing back to disk may take an indeterminate amount
> of time to complete.  Extra caution should be taken when waiting on
> writeback for folios belonging to mappings where this flag is set.
>
> Link: https://lkml.kernel.org/r/20241122232359.429647-1-joannelkoong@xxxxxxxxx
> Link: https://lkml.kernel.org/r/20241122232359.429647-2-joannelkoong@xxxxxxxxx
> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> Reviewed-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> Acked-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
> Cc: Bernd Schubert <bernd.schubert@xxxxxxxxxxx>
> Cc: Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> Cc: Josef Bacik <josef@xxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  include/linux/pagemap.h |   11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> --- a/include/linux/pagemap.h~mm-add-as_writeback_indeterminate-mapping-flag
> +++ a/include/linux/pagemap.h
> @@ -210,6 +210,7 @@ enum mapping_flags {
>  	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
>  				   folio contents */
>  	AS_INACCESSIBLE = 8,	/* Do not attempt direct R/W access to the mapping */
> +	AS_WRITEBACK_INDETERMINATE = 9, /* Use caution when waiting on writeback */
>  	/* Bits 16-25 are used for FOLIO_ORDER */
>  	AS_FOLIO_ORDER_BITS = 5,
>  	AS_FOLIO_ORDER_MIN = 16,
> @@ -335,6 +336,16 @@ static inline bool mapping_inaccessible(
>  	return test_bit(AS_INACCESSIBLE, &mapping->flags);
>  }
>
> +static inline void mapping_set_writeback_indeterminate(struct address_space *mapping)
> +{
> +	set_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
> +}
> +
> +static inline bool mapping_writeback_indeterminate(struct address_space *mapping)
> +{
> +	return test_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
> +}
> +
>  static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
>  {
>  	return mapping->gfp_mask;
> _
>
> Patches currently in -mm which might be from joannelkoong@xxxxxxxxx are
>
> mm-add-as_writeback_indeterminate-mapping-flag.patch
> mm-skip-reclaiming-folios-in-legacy-memcg-writeback-indeterminate-contexts.patch
> fs-writeback-in-wait_sb_inodes-skip-wait-for-as_writeback_indeterminate-mappings.patch
> mm-migrate-skip-migrating-folios-under-writeback-with-as_writeback_indeterminate-mappings.patch
> fuse-remove-tmp-folio-for-writebacks-and-internal-rb-tree.patch
>
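
For readers following the thread, here is a minimal userspace sketch of how a wait site might consult the flag described in the quoted cover letter. It is not kernel code and not part of the patchset. The flag value and the two mapping_*writeback_indeterminate() helpers mirror the quoted hunk. Everything else is an assumption made for illustration: the one-field struct address_space, the non-atomic set_bit()/test_bit() stand-ins, enum wait_context, and should_wait_on_writeback().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model only: flag values copied from the quoted hunk. */
enum mapping_flags {
	AS_STABLE_WRITES = 7,
	AS_INACCESSIBLE = 8,
	AS_WRITEBACK_INDETERMINATE = 9,
};

/* Hypothetical stand-in for the kernel's struct address_space. */
struct address_space {
	unsigned long flags;
};

#define FLAG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic stand-ins for the kernel's set_bit()/test_bit(). */
static inline void set_bit(unsigned int nr, unsigned long *addr)
{
	assert(nr < FLAG_BITS);
	*addr |= 1UL << nr;
}

static inline bool test_bit(unsigned int nr, const unsigned long *addr)
{
	assert(nr < FLAG_BITS);
	return (*addr >> nr) & 1UL;
}

/* Same shape as the helpers added by the quoted patch. */
static inline void mapping_set_writeback_indeterminate(struct address_space *mapping)
{
	set_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
}

static inline bool mapping_writeback_indeterminate(struct address_space *mapping)
{
	return test_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
}

/* Hypothetical labels for the wait sites named in the cover letter. */
enum wait_context {
	WAIT_CTX_RECLAIM,
	WAIT_CTX_SYNC,
	WAIT_CTX_MIGRATE,
	WAIT_CTX_SYNC_FILE_RANGE,
};

/*
 * Hypothetical policy helper: mappings without the flag are always waited
 * on; flagged mappings are waited on only for sync_file_range(), the one
 * case the cover letter names as fine to wait on.
 */
static inline bool should_wait_on_writeback(struct address_space *mapping,
					    enum wait_context ctx)
{
	if (!mapping_writeback_indeterminate(mapping))
		return true;
	return ctx == WAIT_CTX_SYNC_FILE_RANGE;
}
```

The single policy function is only a reading aid. In the series itself the checks live separately in the reclaim, sync, and migration paths, as the patch names listed above suggest.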