Hi, I booted the machine under bare metal to continue bisecting. Thankfully this allowed me to locate the commit that causes the problem. git bisect reports this as the offending commit: commit b82d4b197c782ced82a8b7b76664125d2d3c156c Author: Tejun Heo <tj@xxxxxxxxxx> Date: Fri Apr 13 13:11:31 2012 -0700 blkcg: make request_queue bypassing on allocation With the previous change to guarantee bypass visiblity for RCU read lock regions, entering bypass mode involves non-trivial overhead and future changes are scheduled to make use of bypass mode during init path. Combined it may end up adding noticeable delay during boot. This patch makes request_queue start its life in bypass mode, which is ended on queue init completion at the end of blk_init_allocated_queue(), and updates blk_queue_bypass_start() such that draining and RCU synchronization are performed only when the queue actually enters bypass mode. This avoids unnecessarily switching in and out of bypass mode during init avoiding the overhead and any nasty surprises which may step from leaving bypass mode on half-initialized queues. The boot time overhead was pointed out by Vivek. Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> Cc: Vivek Goyal <vgoyal@xxxxxxxxxx> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> diff --git a/block/blk-core.c b/block/blk-core.c index f2db628..3b02ba3 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -421,14 +421,18 @@ void blk_drain_queue(struct request_queue *q, bool drain_all) */ void blk_queue_bypass_start(struct request_queue *q) { + bool drain; + spin_lock_irq(q->queue_lock); - q->bypass_depth++; + drain = !q->bypass_depth++; queue_flag_set(QUEUE_FLAG_BYPASS, q); spin_unlock_irq(q->queue_lock); - blk_drain_queue(q, false); - /* ensure blk_queue_bypass() is %true inside RCU read lock */ - synchronize_rcu(); + if (drain) { + blk_drain_queue(q, false); + /* ensure blk_queue_bypass() is %true inside RCU read lock */ + synchronize_rcu(); + } } EXPORT_SYMBOL_GPL(blk_queue_bypass_start); @@ -577,6 +581,15 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id) */ q->queue_lock = &q->__queue_lock; + /* + * A queue starts its life with bypass turned on to avoid + * unnecessary bypass on/off overhead and nasty surprises during + * init. The initial bypass will be finished at the end of + * blk_init_allocated_queue(). + */ + q->bypass_depth = 1; + __set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags); + if (blkcg_init_queue(q)) goto fail_id; @@ -672,15 +685,15 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn, q->sg_reserved_size = INT_MAX; - /* - * all done - */ - if (!elevator_init(q, NULL)) { - blk_queue_congestion_threshold(q); - return q; - } + /* init elevator */ + if (elevator_init(q, NULL)) + return NULL; - return NULL; + blk_queue_congestion_threshold(q); + + /* all done, end the initial bypass */ + blk_queue_bypass_end(q); + return q; } EXPORT_SYMBOL(blk_init_allocated_queue); Reverting this commit fixes the regression. :) Joseph. -- CTO | Orion Virtualisation Solutions | www.orionvm.com.au Phone: 1300 56 99 52 | Mobile: 0428 754 846 -- To unsubscribe from this list: send the line "unsubscribe cgroups" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html