Re: + mm-add-as_writeback_indeterminate-mapping-flag.patch added to mm-unstable branch

On Fri, Dec 13, 2024 at 4:34 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> The patch titled
>      Subject: mm: add AS_WRITEBACK_INDETERMINATE mapping flag
> has been added to the -mm mm-unstable branch.  Its filename is
>      mm-add-as_writeback_indeterminate-mapping-flag.patch
>
> This patch will shortly appear at
>      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-add-as_writeback_indeterminate-mapping-flag.patch
>
> This patch will later appear in the mm-unstable branch at
>     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
>
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
>

Hi Andrew,

After the discussion in [1], I think this patchset needs to be
unmerged from mm-unstable. Could you please remove this patchset from
the tree?


Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/def423f4-fecd-4017-9bcb-74a5dbf0e9f5@xxxxxxxxxx/T/#m9f8343c359c72b70e3205ee606459c1f4e1646f4

> ------------------------------------------------------
> From: Joanne Koong <joannelkoong@xxxxxxxxx>
> Subject: mm: add AS_WRITEBACK_INDETERMINATE mapping flag
> Date: Fri, 22 Nov 2024 15:23:55 -0800
>
> Patch series "fuse: remove temp page copies in writeback", v6.
>
> The purpose of this patchset is to help make writeback-cache write
> performance in FUSE filesystems as fast as possible.
>
> In the current FUSE writeback design (see commit 3be5a52b30aa ("fuse:
> support writable mmap")), a temp page is allocated for every dirty page
> to be written back, the contents of the dirty page are copied over to the
> temp page, and the temp page gets handed to the server to write back.
> This is done so that writeback may be immediately cleared on the dirty
> page, and this in turn is done for two reasons:
>
> a) in order to mitigate the following deadlock scenario that may arise
>    if reclaim waits on writeback on the dirty page to complete (more
>    details can be found in this thread [1]):
>
>    * single-threaded FUSE server is in the middle of handling a request
>      that needs a memory allocation
>    * memory allocation triggers direct reclaim
>    * direct reclaim waits on a folio under writeback
>    * the FUSE server can't write back the folio since it's stuck in
>      direct reclaim
>
> b) in order to unblock internal (eg sync, page compaction) waits on
>    writeback without needing the server to complete writing back to disk,
>    which may take an indeterminate amount of time.
>
> Allocating and copying dirty pages to temp pages is the biggest
> performance bottleneck for FUSE writeback.  This patchset aims to get rid
> of the temp page altogether (which will also allow us to get rid of the
> internal FUSE rb tree that is needed to keep track of writeback status on
> the temp pages).  Benchmarks show approximately a 20% improvement in
> throughput for 4k block-size writes and a 45% improvement for 1M
> block-size writes.
>
> With the temp page removed, writeback state is only cleared on the
> dirty page after the server has written it back to disk.  This may take an
> indeterminate amount of time.  There is also the possibility of malicious
> or well-intentioned but buggy servers where writeback may, in the worst
> case, never complete.  This means that any folio_wait_writeback() on a
> dirty page belonging to a FUSE filesystem needs to be carefully audited.
>
> In particular, these are the cases that need to be accounted for:
> * potentially deadlocking in reclaim, as mentioned above
> * potentially stalling sync(2)
> * potentially stalling page migration / compaction
>
> This patchset adds a new mapping flag, AS_WRITEBACK_INDETERMINATE, which
> filesystems may set on their inode mappings to indicate that writeback
> operations may take an indeterminate amount of time to complete.  FUSE
> will set this flag on its mappings.  This patchset adds checks to the
> critical parts of reclaim, sync, and page migration logic where writeback
> may be waited on.
>
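
For readers skimming the cover letter, the idea of checking the flag before
waiting can be sketched in user space roughly as follows. This is only a mock
under stated assumptions: struct address_space here is a stand-in with a
single flags word, the kernel's atomic set_bit()/test_bit() are replaced by
plain bit operations, and enum wait_context and may_wait_on_writeback() are
hypothetical names made up for illustration. The series itself may well place
its checks differently at each call site.

```c
#include <stdbool.h>

/* Stand-in for the kernel enum; only the bit added by this patch is shown. */
enum mapping_flags {
	AS_WRITEBACK_INDETERMINATE = 9,	/* bit number taken from the diff */
};

/* Stand-in for the kernel's struct address_space (assumption: flags only). */
struct address_space {
	unsigned long flags;
};

/* Mirrors the helper in the patch, minus the atomicity of set_bit(). */
static inline void mapping_set_writeback_indeterminate(struct address_space *mapping)
{
	mapping->flags |= 1UL << AS_WRITEBACK_INDETERMINATE;
}

/* Mirrors the helper in the patch, minus the atomicity of test_bit(). */
static inline bool mapping_writeback_indeterminate(const struct address_space *mapping)
{
	return (mapping->flags & (1UL << AS_WRITEBACK_INDETERMINATE)) != 0;
}

/* Hypothetical: the reason a caller wants to wait on writeback. */
enum wait_context {
	WAIT_RECLAIM,		/* direct reclaim */
	WAIT_SYNC,		/* sync(2) */
	WAIT_MIGRATION,		/* page migration / compaction */
	WAIT_FILE_RANGE,	/* sync_file_range() on a caller-supplied fd */
};

/*
 * Hypothetical policy helper, not part of the patch series.
 * Mappings without the flag may always be waited on.  For flagged
 * mappings, only an explicit per-fd request is allowed to wait.
 */
static bool may_wait_on_writeback(const struct address_space *mapping,
				  enum wait_context ctx)
{
	if (!mapping_writeback_indeterminate(mapping))
		return true;
	return ctx == WAIT_FILE_RANGE;
}
```

With a flagged mapping, the sketch refuses the wait for reclaim, sync(2), and
migration but still allows it for sync_file_range(), matching the cases listed
in the cover letter.
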
> Please note the following:
> * For sync(2), waiting on writeback will be skipped for FUSE, but this has no
>   effect on existing behavior. Dirty FUSE pages are already not guaranteed to
>   be written to disk by the time sync(2) returns (eg writeback is cleared on
>   the dirty page but the server may not have written out the temp page to disk
>   yet). If the caller wishes to ensure the data has actually been synced to
>   disk, they should use fsync(2)/fdatasync(2) instead.
> * AS_WRITEBACK_INDETERMINATE does not indicate that the folios should never be
>   waited on when in writeback. There are some cases where the wait is
>   desirable. For example, for the sync_file_range() syscall, it is fine to
>   wait on the writeback since the caller passes in a fd for the operation.
>
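
To illustrate the sync(2) note from the caller's side, here is a minimal
user-space sketch that relies on fsync(2) rather than sync(2) before treating
the data as written. The function name write_durably() is made up for this
example, and whether the data really reaches stable storage on a FUSE mount
still depends on how the server handles the fsync request.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path and return only
 * after fsync(2) has reported success.
 * Returns 0 on success, -1 on error with errno describing the failure.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int saved_errno;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	/* write(2) may be short or interrupted, so loop until done. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Ask for this file's data to be flushed; sync(2) gives no such per-file result. */
	if (fsync(fd) < 0)
		goto fail;

	return close(fd);

fail:
	/* Preserve the original error across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

fdatasync(2) can be substituted for fsync(2) when file metadata such as
timestamps does not need to be flushed.
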
> [1] https://lore.kernel.org/linux-kernel/495d2400-1d96-4924-99d3-8b2952e05fc3@xxxxxxxxxxxxxxxxx/
>
>
> This patch (of 5):
>
> Add a new mapping flag AS_WRITEBACK_INDETERMINATE which filesystems may
> set to indicate that writing back to disk may take an indeterminate amount
> of time to complete.  Extra caution should be taken when waiting on
> writeback for folios belonging to mappings where this flag is set.
>
> Link: https://lkml.kernel.org/r/20241122232359.429647-1-joannelkoong@xxxxxxxxx
> Link: https://lkml.kernel.org/r/20241122232359.429647-2-joannelkoong@xxxxxxxxx
> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> Reviewed-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> Acked-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
> Cc: Bernd Schubert <bernd.schubert@xxxxxxxxxxx>
> Cc: Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> Cc: Josef Bacik <josef@xxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  include/linux/pagemap.h |   11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> --- a/include/linux/pagemap.h~mm-add-as_writeback_indeterminate-mapping-flag
> +++ a/include/linux/pagemap.h
> @@ -210,6 +210,7 @@ enum mapping_flags {
>         AS_STABLE_WRITES = 7,   /* must wait for writeback before modifying
>                                    folio contents */
>         AS_INACCESSIBLE = 8,    /* Do not attempt direct R/W access to the mapping */
> +       AS_WRITEBACK_INDETERMINATE = 9, /* Use caution when waiting on writeback */
>         /* Bits 16-25 are used for FOLIO_ORDER */
>         AS_FOLIO_ORDER_BITS = 5,
>         AS_FOLIO_ORDER_MIN = 16,
> @@ -335,6 +336,16 @@ static inline bool mapping_inaccessible(
>         return test_bit(AS_INACCESSIBLE, &mapping->flags);
>  }
>
> +static inline void mapping_set_writeback_indeterminate(struct address_space *mapping)
> +{
> +       set_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
> +}
> +
> +static inline bool mapping_writeback_indeterminate(struct address_space *mapping)
> +{
> +       return test_bit(AS_WRITEBACK_INDETERMINATE, &mapping->flags);
> +}
> +
>  static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
>  {
>         return mapping->gfp_mask;
> _
>
> Patches currently in -mm which might be from joannelkoong@xxxxxxxxx are
>
> mm-add-as_writeback_indeterminate-mapping-flag.patch
> mm-skip-reclaiming-folios-in-legacy-memcg-writeback-indeterminate-contexts.patch
> fs-writeback-in-wait_sb_inodes-skip-wait-for-as_writeback_indeterminate-mappings.patch
> mm-migrate-skip-migrating-folios-under-writeback-with-as_writeback_indeterminate-mappings.patch
> fuse-remove-tmp-folio-for-writebacks-and-internal-rb-tree.patch
>




