If a bio is throttled and then split after throttling, the split bio can
be resubmitted and enter throttling again. Part of the bio is then
charged multiple times. If the cgroup has an IO limit, the double charge
significantly harms performance. Bio splitting has become quite common
since the arbitrary bio size change.

To fix this, record the disk a bio was throttled against. Once a bio has
been throttled and issued, we record the disk, and we copy that record
to cloned bios, so a cloned bio (including a split bio) is not throttled
again. A stacked block device driver changes the cloned bio's bi_disk;
once bi_disk changes, the recorded throttle disk no longer matches and
the bio should be throttled again. This is why a single bit cannot
indicate whether a cloned bio should be throttled.

We only record the gendisk here: if a cloned bio is remapped to another
disk, it is very unlikely that only partno changes.

Some form of this patch should probably go into stable since v4.2.

Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
Signed-off-by: Shaohua Li <shli@xxxxxx>
---
 block/bio.c               |  3 +++
 block/blk-throttle.c      | 15 ++++++++++++---
 include/linux/blk_types.h |  4 ++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 8338304..dce8314 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -597,6 +597,9 @@ void __bio_clone_fast(struct bio *bio, struct bio *bio_src)
 	 * so we don't set nor calculate new physical/hw segment counts here
 	 */
 	bio->bi_disk = bio_src->bi_disk;
+#ifdef CONFIG_BLK_DEV_THROTTLING
+	bio->bi_throttled_disk = bio_src->bi_throttled_disk;
+#endif
 	bio_set_flag(bio, BIO_CLONED);
 	bio->bi_opf = bio_src->bi_opf;
 	bio->bi_write_hint = bio_src->bi_write_hint;
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index ee6d7b0..155549a 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2130,9 +2130,15 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
 
 	WARN_ON_ONCE(!rcu_read_lock_held());
 
-	/* see throtl_charge_bio() */
-	if (bio_flagged(bio, BIO_THROTTLED) || !tg->has_rules[rw])
+	/*
+	 * see throtl_charge_bio() for BIO_THROTTLED. If a bio is throttled
+	 * against a disk but remapped to other disk, we should throttle it
+	 * again
+	 */
+	if (bio_flagged(bio, BIO_THROTTLED) || !tg->has_rules[rw] ||
+	    (bio->bi_throttled_disk && bio->bi_throttled_disk == bio->bi_disk))
 		goto out;
+	bio->bi_throttled_disk = NULL;
 
 	spin_lock_irq(q->queue_lock);
 
@@ -2227,8 +2233,11 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
 	 * don't want bios to leave with the flag set. Clear the flag if
 	 * being issued.
 	 */
-	if (!throttled)
+	if (!throttled) {
 		bio_clear_flag(bio, BIO_THROTTLED);
+		/* if the bio is cloned, we don't throttle it again */
+		bio->bi_throttled_disk = bio->bi_disk;
+	}
 
 #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
 	if (throttled || !td->track_bio_latency)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3385c89..2507566 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -89,6 +89,10 @@ struct bio {
 	void			*bi_cg_private;
 	struct blk_issue_stat	bi_issue_stat;
 #endif
+#ifdef CONFIG_BLK_DEV_THROTTLING
+	/* record which disk the bio is throttled against */
+	struct gendisk *bi_throttled_disk;
+#endif
 #endif
 	union {
 #if defined(CONFIG_BLK_DEV_INTEGRITY)
-- 
2.9.5
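
For illustration only, the intended behaviour can be modelled in
userspace. This is a rough sketch, not kernel code: every model_* name
is hypothetical, and queueing, locking, the BIO_THROTTLED flag and
has_rules[] are all left out. It only mirrors the comparison of
bi_throttled_disk against bi_disk, the copy on clone, and the effect of
a stacked driver remapping bi_disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's gendisk or bio. */
struct model_disk {
	int id;
};

struct model_bio {
	struct model_disk *bi_disk;           /* disk the bio targets now */
	struct model_disk *bi_throttled_disk; /* disk it was charged against, or NULL */
};

/* Mirrors the new early-exit condition added to blk_throtl_bio(). */
static bool model_already_charged(const struct model_bio *bio)
{
	return bio->bi_throttled_disk != NULL &&
	       bio->bi_throttled_disk == bio->bi_disk;
}

/*
 * Charge the bio unless it was already charged against its current disk.
 * Returns true if a charge was made. Simplification: the bio is treated
 * as issued immediately, so the disk is recorded right away.
 */
static bool model_throttle(struct model_bio *bio, unsigned int *charges)
{
	if (model_already_charged(bio))
		return false;
	(*charges)++;
	bio->bi_throttled_disk = bio->bi_disk;
	return true;
}

/* Mirrors the __bio_clone_fast() change: the record travels with the clone. */
static void model_clone(struct model_bio *dst, const struct model_bio *src)
{
	assert(dst != NULL && src != NULL);
	dst->bi_disk = src->bi_disk;
	dst->bi_throttled_disk = src->bi_throttled_disk;
}

/* A stacked driver redirecting the clone to a lower disk. */
static void model_remap(struct model_bio *bio, struct model_disk *lower)
{
	bio->bi_disk = lower;
}
```

In this model, a clone taken after the parent was charged is skipped by
model_throttle(), while the same clone is charged again after
model_remap() points it at a different disk. A single "already
throttled" bit could not tell those two cases apart.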