On Tue, Feb 22, 2011 at 10:24:26AM -0500, Vivek Goyal wrote:

[..]
>
> - I don't see any throttling messages. They are prefixed by "throtl". So
>   it seems all this IO is happening in root group. I believe it belongs
>   to unthrottled VM. So to me it looks that system reached in bad shape
>   even before throttled VMs were started.

You are taking the trace of /dev/sdb and not of /dev/vdisks/kernel3 etc.,
hence I don't see the throttle messages. So that's fine.

>
> - So it sounds more and more like a CFQ issue which happens in conjunction
>   with throttling. I will try to reproduce it.

I tried a lot but I can't reproduce the issue. So now I shall have to rely
on data from you.

>
> - Need little more info about how did you capture the blktrace. So you
>   started blktrace and then started dd in parallel in all the three
>   VMs and immediately system freezes and these are the only logs we see
>   on console?

Can you please apply the attached patch? It just makes the CFQ output a
little more verbose. Then run the test again and capture the trace.

- Start the trace on /dev/sdb
- Start the dd jobs in the virtual machines
- Wait for the system to hang
- Press CTRL-C
- Make sure there were no lost events; otherwise increase the size and
  number of buffers.

Can you also open tracing in another window and trace one of the
throttled dm devices, say /dev/vdisks/kernel3, following the same
procedure as above? So let the two traces run in parallel.
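The capture procedure above can be sketched as a small shell helper. This is only a sketch under assumptions, not something taken from the thread: the output basenames (sdb, kernel3) and the buffer values (-b 1024, -n 8) are illustrative, and the dm device path follows the /dev/vdisks/kernel3 spelling used earlier in this mail. The script only prints the blktrace commands, so each can be run by hand in its own window.

```shell
#!/bin/sh
# Sketch only: prints the blktrace commands for the two parallel traces.
# Buffer values are illustrative assumptions, not values from this thread.

BUF_KB=1024   # -b: sub-buffer size in KiB (assumed value)
NUM_BUFS=8    # -n: number of sub-buffers (assumed value)

# Print the blktrace invocation for one device.
#   $1 = block device to trace
#   $2 = basename for the output files (hypothetical names)
trace_cmd() {
    dev=$1
    out=$2
    echo "blktrace -d $dev -o $out -b $BUF_KB -n $NUM_BUFS"
}

# Window 1: the underlying disk.
trace_cmd /dev/sdb sdb
# Window 2: one of the throttled dm devices.
trace_cmd /dev/vdisks/kernel3 kernel3
```

Run the printed commands as root, start the dd jobs, and press CTRL-C once the system hangs. If blktrace's exit summary mentions dropped events, raise BUF_KB and NUM_BUFS and repeat the run.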
Thanks
Vivek

---
 block/cfq-iosched.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Index: linux-2.6/block/cfq-iosched.c
===================================================================
--- linux-2.6.orig/block/cfq-iosched.c	2011-02-22 13:23:25.000000000 -0500
+++ linux-2.6/block/cfq-iosched.c	2011-02-22 14:01:21.515363676 -0500
@@ -498,7 +498,7 @@ static inline bool cfq_bio_sync(struct b
 static inline void cfq_schedule_dispatch(struct cfq_data *cfqd)
 {
 	if (cfqd->busy_queues) {
-		cfq_log(cfqd, "schedule dispatch");
+		cfq_log(cfqd, "schedule dispatch: busy_queues=%d rq_queued=%d rq_in_driver=%d", cfqd->busy_queues, cfqd->rq_queued, cfqd->rq_in_driver);
 		kblockd_schedule_work(cfqd->queue, &cfqd->unplug_work);
 	}
 }
@@ -2229,6 +2229,8 @@ static struct cfq_queue *cfq_select_queu
 {
 	struct cfq_queue *cfqq, *new_cfqq = NULL;
 
+	cfq_log(cfqd, "select_queue: busy_queues=%d rq_queued=%d rq_in_driver=%d", cfqd->busy_queues, cfqd->rq_queued, cfqd->rq_in_driver);
+
 	cfqq = cfqd->active_queue;
 	if (!cfqq)
 		goto new_queue;
@@ -2499,8 +2501,10 @@ static int cfq_dispatch_requests(struct
 		return cfq_forced_dispatch(cfqd);
 
 	cfqq = cfq_select_queue(cfqd);
-	if (!cfqq)
+	if (!cfqq) {
+		cfq_log(cfqd, "select: no cfqq selected");
 		return 0;
+	}
 
 	/*
 	 * Dispatch a request from this cfqq, if it is allowed
@@ -3359,7 +3363,7 @@ static void cfq_insert_request(struct re
 	struct cfq_data *cfqd = q->elevator->elevator_data;
 	struct cfq_queue *cfqq = RQ_CFQQ(rq);
 
-	cfq_log_cfqq(cfqd, cfqq, "insert_request");
+	cfq_log_cfqq(cfqd, cfqq, "insert_request: busy_queues=%d rq_queued=%d rq_in_driver=%d", cfqd->busy_queues, cfqd->rq_queued, cfqd->rq_in_driver);
 	cfq_init_prio_data(cfqq, RQ_CIC(rq)->ioc);
 
 	rq_set_fifo_time(rq, jiffies + cfqd->cfq_fifo_expire[rq_is_sync(rq)]);