Re: [PATCH v4.6-rc] writeback: Fix performance regression in wb_over_bg_thresh()

On 5/5/16, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> Hi Jens,
>
> This fix seems to have been missed; it should go into v4.6.
>
> Please apply.
>
> Thanks,
> Miklos
>
>
> From: Howard Cochran <hcochran@xxxxxxxxxxxxxxxx>
> Subject: writeback: Fix performance regression in wb_over_bg_thresh()
> Date: Thu, 10 Mar 2016 01:12:39 -0500
>
> Commit 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use
> wb_domain aware operations") unintentionally changed this function's
> meaning from "are there more dirty pages than the background writeback
> threshold" to "are there more dirty pages than the writeback threshold".
> The background writeback threshold is typically half of the writeback
> threshold, so this had the effect of raising the number of dirty pages
> required to cause a writeback worker to perform background writeout.
>
> This can cause a very severe performance regression when a BDI uses
> BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker
> can now disagree on whether writeback should be initiated.
>
> For example, in a system having 1GB of RAM, a single spinning disk, and a
> "pass-through" FUSE filesystem mounted over the disk, application code
> mmapped a 128MB file on the disk and was randomly dirtying pages in that
> mapping.
>
> Because FUSE uses strictlimit and has a default max_ratio of only 1%, in
> balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the
> dirty_freerun_ceiling is the average of those, ~150. So, it pauses the
> dirtying processes when we have 151 dirty pages and wakes up a background
> writeback worker. But the worker tests the wrong threshold (200 instead of
> 100), so it does not initiate writeback and just returns.
>
> Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
> worker who will do nothing. It remains stuck in this state until the few
> dirty pages that we have finally expire and we write them back for that
> reason. Then the whole process repeats, resulting in near-zero throughput
> through the FUSE BDI.
>
> The fix is to call the parameterized variant of wb_calc_thresh, so that the
> worker will do writeback if the bg_thresh is exceeded which was the
> behavior before the referenced commit.
>
> Fixes: 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain
> aware operations")
> Signed-off-by: Howard Cochran <hcochran@xxxxxxxxxxxxxxxx>
> Acked-by: Tejun Heo <tj@xxxxxxxxxx>
> Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v4.2+

Feel free to add my...

      Tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>

- sed@ -

> ---
>  mm/page-writeback.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1910,7 +1910,8 @@ bool wb_over_bg_thresh(struct bdi_writeb
>  	if (gdtc->dirty > gdtc->bg_thresh)
>  		return true;
>
> -	if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(gdtc))
> +	if (wb_stat(wb, WB_RECLAIMABLE) >
> +	    wb_calc_thresh(gdtc->wb, gdtc->bg_thresh))
>  		return true;
>
>  	if (mdtc) {
> @@ -1924,7 +1925,8 @@ bool wb_over_bg_thresh(struct bdi_writeb
>  		if (mdtc->dirty > mdtc->bg_thresh)
>  			return true;
>
> -		if (wb_stat(wb, WB_RECLAIMABLE) > __wb_calc_thresh(mdtc))
> +		if (wb_stat(wb, WB_RECLAIMABLE) >
> +		    wb_calc_thresh(mdtc->wb, mdtc->bg_thresh))
>  			return true;
>  	}
>
>
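
For readers following the thread, the threshold disagreement described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the struct and function names are hypothetical, and the values (thresh ~200, bg_thresh ~100, 151 dirty pages) are taken from the FUSE example in the quoted description.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the per-wb numbers quoted in the
 * commit message. None of these names exist in mm/page-writeback.c.
 */
struct wb_model {
	unsigned long thresh;      /* wb dirty threshold, ~200 pages */
	unsigned long bg_thresh;   /* wb background threshold, ~100 pages */
	unsigned long reclaimable; /* dirty pages currently on this wb */
};

/* The freerun ceiling is the average of the two thresholds. */
unsigned long freerun_ceiling(const struct wb_model *m)
{
	return (m->thresh + m->bg_thresh) / 2;
}

/* The dirtying task is paused once it exceeds the freerun ceiling. */
bool dirtier_is_throttled(const struct wb_model *m)
{
	return m->reclaimable > freerun_ceiling(m);
}

/* Regressed worker check: compares against the full threshold. */
bool worker_starts_writeback_regressed(const struct wb_model *m)
{
	return m->reclaimable > m->thresh;
}

/* Fixed worker check: compares against the background threshold. */
bool worker_starts_writeback_fixed(const struct wb_model *m)
{
	return m->reclaimable > m->bg_thresh;
}
```

With thresh = 200, bg_thresh = 100 and 151 dirty pages, the model throttles the dirtier (151 > 150) while the regressed worker check declines to write back (151 is not above 200), which is the stall described above. The fixed check (151 > 100) lets the worker make progress.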