Re: [for-416 PATCH] bcache: fix writeback target calc on large devices

tang.junhui@xxxxxxxxxx · Tue, 2 Jan 2018 14:33:51 +0800

From: Tang Junhui <tang.junhui@xxxxxxxxxx>

This patch is useful for preventing overflow of the expression
(cache_dirty_target * bdev_sectors(dc->bdev)), but it also
introduces a calculation error. For example, when there is a 1G cached
device alongside 100*164G cached devices, bdev_share_per16k for the 1G
device rounds down to zero, so its "target" value is always zero and
the write-back threshold stops having any effect for that device.
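
To make the arithmetic concrete, here is a rough userspace sketch of both
calculations. The helper names (share_per16k, target_patched, target_old)
are made up for illustration and plain 64-bit division stands in for
div64_u64(); it assumes 512-byte sectors and reads "100*164G" as 16400G
of other cached devices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's bdev_share_per16k computation. */
static uint32_t share_per16k(uint64_t bdev_sectors, uint64_t cached_dev_sectors)
{
	assert(cached_dev_sectors != 0);
	/* Integer division truncates: any share below 1/16384 becomes 0. */
	return (uint32_t)((bdev_sectors << 14) / cached_dev_sectors);
}

/* Target as computed by the patch: scale by the 16384ths share. */
static int64_t target_patched(uint64_t cache_dirty_target,
			      uint64_t bdev_sectors,
			      uint64_t cached_dev_sectors)
{
	uint32_t share = share_per16k(bdev_sectors, cached_dev_sectors);

	return (int64_t)((cache_dirty_target * share) >> 14);
}

/* Target as computed before the patch (the product may overflow). */
static int64_t target_old(uint64_t cache_dirty_target,
			  uint64_t bdev_sectors,
			  uint64_t cached_dev_sectors)
{
	assert(cached_dev_sectors != 0);
	return (int64_t)(cache_dirty_target * bdev_sectors / cached_dev_sectors);
}
```

With bdev_sectors = 1 << 21 (1G) and cached_dev_sectors = 16401 << 21,
the share is 16384 / 16401, which truncates to 0, so target_patched()
returns 0 whatever cache_dirty_target is, while the old formula still
gives a non-zero target.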

Maybe we could first check whether the expression
(cache_dirty_target * bdev_sectors(dc->bdev)) overflows. If it does,
we calculate target as this patch does; otherwise,
we calculate it the old way.
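
Something along these lines is what I have in mind. This is only a
userspace sketch, not a tested kernel change: writeback_target() is a
hypothetical helper, plain division stands in for div64_u64(), and the
overflow test is the generic "a > UINT64_MAX / b" check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: use the exact (old) formula when the product fits
 * in 64 bits, and fall back to the patch's 16384ths share otherwise.
 */
static int64_t writeback_target(uint64_t cache_dirty_target,
				uint64_t bdev_sectors,
				uint64_t cached_dev_sectors)
{
	assert(cached_dev_sectors != 0);

	/* The product overflows iff cache_dirty_target > UINT64_MAX / bdev_sectors. */
	if (bdev_sectors != 0 &&
	    cache_dirty_target > UINT64_MAX / bdev_sectors) {
		/* Overflow case: scale by the share, as in the patch. */
		uint32_t share =
			(uint32_t)((bdev_sectors << 14) / cached_dev_sectors);

		return (int64_t)((cache_dirty_target * share) >> 14);
	}

	/* No overflow: the old calculation keeps full precision. */
	return (int64_t)(cache_dirty_target * bdev_sectors / cached_dev_sectors);
}
```

The fallback branch keeps the limits of the patch itself (the shift by 14
and the multiply by the share can still overflow for extreme sizes), but
small devices in a large cache set would no longer get a zero target.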

>Bcache needs to scale the dirty data in the cache over the multiple
>backing disks in order to calculate writeback rates for each.
>The previous code did this by multiplying the target number of dirty
>sectors by the backing device size, and expected it to fit into a
>uint64_t; this blows up on relatively small backing devices.
>
>The new approach figures out the bdev's share in 16384ths of the overall
>cached data.  This is chosen to cope well when bdevs drastically vary in
>size and to ensure that bcache can cross the petabyte boundary for each
>backing device.
>
>Reported-by: Jack Douglas <jack@xxxxxxxxxxxxxxxxxxxxxxx>
>Signed-off-by: Michael Lyle <mlyle@xxxxxxxx>
>---
> drivers/md/bcache/writeback.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>index 56a37884ca8b..ddbbeec1f0ee 100644
>--- a/drivers/md/bcache/writeback.c
>+++ b/drivers/md/bcache/writeback.c
>@@ -24,10 +24,23 @@ static void __update_writeback_rate(struct cached_dev *dc)
>     struct cache_set *c = dc->disk.c;
>     uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
>                 bcache_flash_devs_sectors_dirty(c);
>+    /*
>+     * Unfortunately we don't know the exact share of dirty data for
>+     * each backing device.  Therefore, we need to infer the writeback
>+     * for each disk based on its assumed proportion of dirty data.
>+     *
>+     * 16384 is chosen here as something that each backing device should
>+     * be a reasonable fraction of the share, and not to blow up until
>+     * individual backing devices are a petabyte.
>+     */
>+    uint32_t bdev_share_per16k =
>+        div64_u64(bdev_sectors(dc->bdev) << 14,
>+                c->cached_dev_sectors);
>+
>     uint64_t cache_dirty_target =
>         div_u64(cache_sectors * dc->writeback_percent, 100);
>-    int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
>-                   c->cached_dev_sectors);
>+
>+    int64_t target = (cache_dirty_target * bdev_share_per16k) >> 14;
> 
>     /*
>      * PI controller:
>-- 
>2.14.1
>

Thanks,
Tang Junhui