On Tue, 2018-01-23 at 10:22 +0100, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 5:20pm -0500,
> Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote:
>
> > On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > > And yet Laurence cannot reproduce any such lockups with your test...
> >
> > Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> > already succeeded at running an unmodified version of my tests. In one of the
> > e-mails Laurence sent me this morning I read that he modified these scripts
> > to get past a kernel module unload failure that was reported while starting
> > these tests. So the next step is to check which changes were made to the test
> > scripts and also whether the test results are still valid.
> >
> > > Are you absolutely certain this patch doesn't help you?
> > > https://patchwork.kernel.org/patch/10174037/
> > >
> > > If it doesn't then that is actually very useful to know.
> >
> > The first I tried this morning is to run the srp-test software against a merge
> > of Jens' for-next branch and your dm-4.16 branch. Since I noticed that the dm
> > queue locked up I reinserted a blk_mq_delay_run_hw_queue() call in the dm code.
> > Since even that was not sufficient I tried to kick the queues via debugfs (for
> > s in /sys/kernel/debug/block/*/state; do echo kick >$s; done). Since that was
> > not sufficient to resolve the queue stall I reverted the following tree patches
> > that are in Jens' tree:
> > * "blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"
> > * "blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request"
> > * "blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy"
> >
> > Only after I had done this the srp-test software ran again without triggering
> > dm queue lockups.

Hello Mike,

I want to take back what I wrote about the reverts.
I have not yet tried to analyze exactly which change made
blk_insert_cloned_request() work reliably for me but with Jens' latest
for-next branch and your for-4.16 branch merged all I need to avoid queue
stalls is the following:

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index f16096af879a..c59c59cfd2a5 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		/* Undo dm_start_request() before requeuing */
 		rq_end_stats(md, rq);
 		rq_completed(md, rq_data_dir(rq), false);
+		blk_mq_delay_run_hw_queue(hctx, 100/*ms*/);
 		return BLK_STS_RESOURCE;
 	}

Bart.