Re: 3.6-rc5 cgroups blkio throttle + md regression

Joseph Glanville <joseph.glanville@xxxxxxxxxxxxxx> · Thu, 20 Sep 2012 04:20:42 +1000

Hi,

I booted the machine on bare metal to continue bisecting.
Thankfully this allowed me to locate the commit that causes the
problem.

git bisect reports this as the offending commit:
commit b82d4b197c782ced82a8b7b76664125d2d3c156c
Author: Tejun Heo <tj@xxxxxxxxxx>
Date:   Fri Apr 13 13:11:31 2012 -0700

    blkcg: make request_queue bypassing on allocation

    With the previous change to guarantee bypass visiblity for RCU read
    lock regions, entering bypass mode involves non-trivial overhead and
    future changes are scheduled to make use of bypass mode during init
    path.  Combined it may end up adding noticeable delay during boot.

    This patch makes request_queue start its life in bypass mode, which is
    ended on queue init completion at the end of
    blk_init_allocated_queue(), and updates blk_queue_bypass_start() such
    that draining and RCU synchronization are performed only when the
    queue actually enters bypass mode.

    This avoids unnecessarily switching in and out of bypass mode during
    init avoiding the overhead and any nasty surprises which may step from
    leaving bypass mode on half-initialized queues.

    The boot time overhead was pointed out by Vivek.

    Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
    Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
    Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

diff --git a/block/blk-core.c b/block/blk-core.c
index f2db628..3b02ba3 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -421,14 +421,18 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
  */
 void blk_queue_bypass_start(struct request_queue *q)
 {
+	bool drain;
+
 	spin_lock_irq(q->queue_lock);
-	q->bypass_depth++;
+	drain = !q->bypass_depth++;
 	queue_flag_set(QUEUE_FLAG_BYPASS, q);
 	spin_unlock_irq(q->queue_lock);

-	blk_drain_queue(q, false);
-	/* ensure blk_queue_bypass() is %true inside RCU read lock */
-	synchronize_rcu();
+	if (drain) {
+		blk_drain_queue(q, false);
+		/* ensure blk_queue_bypass() is %true inside RCU read lock */
+		synchronize_rcu();
+	}
 }
 EXPORT_SYMBOL_GPL(blk_queue_bypass_start);

@@ -577,6 +581,15 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	 */
 	q->queue_lock = &q->__queue_lock;

+	/*
+	 * A queue starts its life with bypass turned on to avoid
+	 * unnecessary bypass on/off overhead and nasty surprises during
+	 * init.  The initial bypass will be finished at the end of
+	 * blk_init_allocated_queue().
+	 */
+	q->bypass_depth = 1;
+	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
+
 	if (blkcg_init_queue(q))
 		goto fail_id;

@@ -672,15 +685,15 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,

 	q->sg_reserved_size = INT_MAX;

-	/*
-	 * all done
-	 */
-	if (!elevator_init(q, NULL)) {
-		blk_queue_congestion_threshold(q);
-		return q;
-	}
+	/* init elevator */
+	if (elevator_init(q, NULL))
+		return NULL;

-	return NULL;
+	blk_queue_congestion_threshold(q);
+
+	/* all done, end the initial bypass */
+	blk_queue_bypass_end(q);
+	return q;
 }
 EXPORT_SYMBOL(blk_init_allocated_queue);


Reverting this commit fixes the regression. :)
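
For anyone following along, here is a rough user-space sketch of the bypass
depth accounting that the diff above introduces. It is only a model, not
kernel code: struct model_queue and the model_* functions are hypothetical
names, locking and RCU are left out, and the body of model_bypass_end() is an
assumption about what blk_queue_bypass_end() does, since that function is not
part of the diff. The point it illustrates is that a queue now starts at
depth 1, so the expensive drain path only runs on a 0 -> 1 transition, and
the flag stays set until something calls the "end" side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bypass-related fields of struct request_queue. */
struct model_queue {
	int bypass_depth;	/* nesting count of bypass requests */
	bool bypass_flag;	/* models QUEUE_FLAG_BYPASS */
	int drain_count;	/* times the drain + synchronize_rcu() path would run */
};

/* Models the hunk in blk_alloc_queue_node(): the queue is born in bypass mode. */
static void model_queue_alloc(struct model_queue *q)
{
	q->bypass_depth = 1;
	q->bypass_flag = true;
	q->drain_count = 0;
}

/* Models the patched blk_queue_bypass_start(): drain only when depth was 0. */
static void model_bypass_start(struct model_queue *q)
{
	bool drain = !q->bypass_depth++;

	q->bypass_flag = true;
	if (drain)
		q->drain_count++;
}

/*
 * Assumed behaviour of blk_queue_bypass_end(), which the diff does not show:
 * drop one level and clear the flag once the depth reaches zero.
 */
static void model_bypass_end(struct model_queue *q)
{
	if (!--q->bypass_depth)
		q->bypass_flag = false;
	assert(q->bypass_depth >= 0);
}
```

In this model, a queue that is allocated but never reaches the
model_bypass_end() call corresponding to the end of
blk_init_allocated_queue() keeps bypass_flag set indefinitely. Whether that
is what is happening on the md path here is only a guess; the bisect result
alone does not show it.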

Joseph.

-- 
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846