Re: [PATCH] btrfs: Use the normal writeback path for flushing delalloc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 4, 2021 at 5:06 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> This is a revert for 38d715f494f2 ("btrfs: use
> btrfs_start_delalloc_roots in shrink_delalloc").  A user reported a
> problem where performance was significantly worse with this patch
> applied.  The problem needs to be fixed with proper pre-flushing, and
> changes to how we deal with the work queues for the inodes.  However
> that work is much more complicated than is acceptable for stable, and
> simply reverting this patch fixes the problem.  The original patch was
> a cleanup of the code, so it's fine to revert it.  My numbers for the
> original reported test, which was untarring a copy of the firefox
> sources, are as follows
>
> 5.9     0m54.258s
> 5.10    1m26.212s
> Fix     0m35.038s
>
> cc: stable@xxxxxxxxxxxxxxx # 5.10
> Reported-by: René Rebe <rene@xxxxxxxxxxxx>
> Fixes: 38d715f494f2 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc")
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> ---
> Dave, this is ontop of linus's branch, because we've changed the arguments for
> btrfs_start_delalloc_roots in misc-next, and this needs to go back to 5.10 ASAP.
> I can send a misc-next version if you want to have it there as well while we're
> waiting for it to go into linus's tree, just let me know.

Adding this to stable releases will also make the following fix not
work on stable releases:

https://lore.kernel.org/linux-btrfs/39c2a60aa682f69f9823f51aa119d37ef4b9f834.1606909923.git.fdmanana@xxxxxxxx/

Since now the async reclaim task can trigger writeback through
writeback_inodes_sb_nr() and not only through
btrfs_start_delalloc_roots().
Other than changing that patch to make extent_write_cache_pages() do
nothing when the inode has the bit BTRFS_INODE_NO_DELALLOC_FLUSH set,
I'm not seeing other simple ways to do it.

Thanks.

>
>  fs/btrfs/space-info.c | 54 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 64099565ab8f..a2b322275b8d 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -465,6 +465,28 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
>         up_read(&info->groups_sem);
>  }
>
> +static void btrfs_writeback_inodes_sb_nr(struct btrfs_fs_info *fs_info,
> +                                        unsigned long nr_pages, u64 nr_items)
> +{
> +       struct super_block *sb = fs_info->sb;
> +
> +       if (down_read_trylock(&sb->s_umount)) {
> +               writeback_inodes_sb_nr(sb, nr_pages, WB_REASON_FS_FREE_SPACE);
> +               up_read(&sb->s_umount);
> +       } else {
> +               /*
> +                * We needn't worry the filesystem going from r/w to r/o though
> +                * we don't acquire ->s_umount mutex, because the filesystem
> +                * should guarantee the delalloc inodes list be empty after
> +                * the filesystem is readonly(all dirty pages are written to
> +                * the disk).
> +                */
> +               btrfs_start_delalloc_roots(fs_info, nr_items);
> +               if (!current->journal_info)
> +                       btrfs_wait_ordered_roots(fs_info, nr_items, 0, (u64)-1);
> +       }
> +}
> +
>  static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
>                                         u64 to_reclaim)
>  {
> @@ -490,8 +512,10 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
>         struct btrfs_trans_handle *trans;
>         u64 delalloc_bytes;
>         u64 dio_bytes;
> +       u64 async_pages;
>         u64 items;
>         long time_left;
> +       unsigned long nr_pages;
>         int loops;
>
>         /* Calc the number of the pages we need flush for space reservation */
> @@ -532,8 +556,36 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
>
>         loops = 0;
>         while ((delalloc_bytes || dio_bytes) && loops < 3) {
> -               btrfs_start_delalloc_roots(fs_info, items);
> +               nr_pages = min(delalloc_bytes, to_reclaim) >> PAGE_SHIFT;
> +
> +               /*
> +                * Triggers inode writeback for up to nr_pages. This will invoke
> +                * ->writepages callback and trigger delalloc filling
> +                *  (btrfs_run_delalloc_range()).
> +                */
> +               btrfs_writeback_inodes_sb_nr(fs_info, nr_pages, items);
> +               /*
> +                * We need to wait for the compressed pages to start before
> +                * we continue.
> +                */
> +               async_pages = atomic_read(&fs_info->async_delalloc_pages);
> +               if (!async_pages)
> +                       goto skip_async;
> +
> +               /*
> +                * Calculate how many compressed pages we want to be written
> +                * before we continue. I.e if there are more async pages than we
> +                * require wait_event will wait until nr_pages are written.
> +                */
> +               if (async_pages <= nr_pages)
> +                       async_pages = 0;
> +               else
> +                       async_pages -= nr_pages;
>
> +               wait_event(fs_info->async_submit_wait,
> +                          atomic_read(&fs_info->async_delalloc_pages) <=
> +                          (int)async_pages);
> +skip_async:
>                 loops++;
>                 if (wait_ordered && !trans) {
>                         btrfs_wait_ordered_roots(fs_info, items, 0, (u64)-1);
> --
> 2.26.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux