Currently, blk_throtl_bio() issues the passed in bio directly if it's within limits of its associated tg (throtl_grp). This behavior becomes incorrect with hierarchy support as the bio should be accounted to and throttled by the ancestor throtl_grps too. This patch makes the direct issue path of blk_throtl_bio() to loop until it reaches the top-level service_queue or gets throttled. If the former, the bio can be issued directly; otherwise, it gets queued at the first layer it was above limits. As tg->parent_sq is always the top-level service queue currently, this patch in itself doesn't make any behavior differences. Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> --- block/blk-throttle.c | 27 ++++++++++++++++++++------- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 8c6e133..52321a4 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1239,12 +1239,16 @@ bool blk_throtl_bio(struct request_queue *q, struct bio *bio) sq = &tg->service_queue; - /* throtl is FIFO - if other bios are already queued, should queue */ - if (sq->nr_queued[rw]) - goto queue_bio; + while (true) { + /* throtl is FIFO - if bios are already queued, should queue */ + if (sq->nr_queued[rw]) + break; - /* Bio is with-in rate limit of group */ - if (tg_may_dispatch(tg, bio, NULL)) { + /* if above limits, break to queue */ + if (!tg_may_dispatch(tg, bio, NULL)) + break; + + /* within limits, let's charge and dispatch directly */ throtl_charge_bio(tg, bio); /* @@ -1259,10 +1263,19 @@ bool blk_throtl_bio(struct request_queue *q, struct bio *bio) * So keep on trimming slice even if bio is not queued. */ throtl_trim_slice(tg, rw); - goto out_unlock; + + /* + * @bio passed through this layer without being throttled. + * Climb up the ladder. If we''re already at the top, it + * can be executed directly. + */ + sq = sq->parent_sq; + tg = sq_to_tg(sq); + if (!tg) + goto out_unlock; } -queue_bio: + /* out-of-limit, queue to @tg */ throtl_log(sq, "[%c] bio. bdisp=%llu sz=%u bps=%llu iodisp=%u iops=%u queued=%d/%d", rw == READ ? 'R' : 'W', tg->bytes_disp[rw], bio->bi_size, tg->bps[rw], -- 1.8.1.4 _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers