On Thu, Sep 01 2016 at 8:03pm -0400, Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote: > On 09/01/2016 04:48 PM, Mike Snitzer wrote: > > On Thu, Sep 01 2016 at 7:17pm -0400, > > Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote: > >> Sorry that I misread your previous e-mail. After I received your > >> latest e-mail I rebased my tree on top of the devel.bart branch > >> mentioned above. My tests still pass. The only two patches in my > >> tree that are relevant and that are not in the devel.bart branch > >> have been attached to this e-mail. Did your test involve the sd > >> driver? If so, do the attached two patches help? If the sd driver > >> was not involved, can you provide more information about the hang > >> you ran into? The output and log messages generated by the following > >> commands after the hang has been reproduced would be very welcome: > >> * echo w > /proc/sysrq-trigger > >> * (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list}) > > > > sd is used. I'll apply those patches and test, tomorrow, but I'm pretty > > skeptical. > > > > Haven't had any problems with these tests for quite a while. The tests > > I'm running are just those in the mptest testsuite, see: > > https://github.com/snitm/mptest > > > > Running them should be as simple as you doing: > > > > git clone git://github.com/snitm/mptest.git > > cd mptest > > ./runtest > > > > The default is to use dm-mq on scsi-mq ontop of tcmloop. > > > > [ ... ] > > Hello Mike, > > If I run mptest on my setup I can reproduce the hang. But what I see is > that the service-time path selector is in use when the hang is triggered. > I will patch that path selector in the same way as I did with the > queue-length path selector and rerun the test. > > # dmsetup table > 1Linux_scsi_debug_2000: 0 2097152 multipath 3 retain_attached_hw_handler queue_mode mq 1 alua 3 1 service-time 0 1 2 8:128 1 1 service-time 0 1 2 8:144 1 1 service-time 0 1 2 8:112 1 1 > mp: 0 2097152 multipath 3 retain_attached_hw_handler queue_mode mq 1 alua 2 1 queue-length 0 2 1 8:96 1 8:112 1 queue-length 0 2 1 8:128 1 8:144 1 > # (cd /sys/block && grep -a '' dm*/mq/*/{pending,cpu*/rq_list}) | grep -v ':$' > dm-0/mq/0/pending: ffff880358610000 > dm-1/mq/0/pending: ffff880358220200 > dm-1/mq/0/pending: ffff880358220400 In case I haven't been clear: calling blk_mq_freeze_queue() _after_ you've suspended the DM device will only trigger IO that will get re-queued to the DM device's blk-mq queue. So you're creating a livelock since blk_mq_freeze_queue_wait() will not return, see stack from: https://www.redhat.com/archives/dm-devel/2016-September/msg00017.html FYI, the BLK_MQ_S_STOPPED check that you removed from dm_mq_queue_rq() in this commit... http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=devel.bart&id=69eb3e60e099a6117fc754e70eedd504685326ad ...is effectively serving as this when the device is suspended: diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c index b5db523..3bc16dc 100644 --- a/drivers/md/dm-rq.c +++ b/drivers/md/dm-rq.c @@ -863,6 +863,9 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, dm_put_live_table(md, srcu_idx); } + if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) + return BLK_MQ_RQ_QUEUE_BUSY; + if (ti->type->busy && ti->type->busy(ti)) return BLK_MQ_RQ_QUEUE_BUSY; Which is comparable to what the old .request_fn DM does (see: dm_make_request). Without that type of suspend check the code will go on to call dm-mpath.c:multipath_clone_and_map() which will only result in DM_MAPIO_REQUEUE (and dm_mq_queue_rq returning BLK_MQ_RQ_QUEUE_BUSY in the case when multipath has no paths, via must_push_back_rq()). Anyway, not what we want in general. The goal during DM device suspend is to stop IO from being mapped. With request-based DM suspend we're punting requests back to the block layer's request_queue. So in the case of blk-mq request-based DM: we cannot expect blk_mq_freeze_queue(), during suspend, to complete if requests are getting requeued to the blk-mq queue via BLK_MQ_RQ_QUEUE_BUSY. Mike -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel