On Thu, Jan 18, 2018 at 05:24:29PM -0700, Jens Axboe wrote: > On 1/18/18 5:18 PM, Ming Lei wrote: > > On Fri, Jan 19, 2018 at 12:14:24AM +0000, Bart Van Assche wrote: > >> On Fri, 2018-01-19 at 08:11 +0800, Ming Lei wrote: > >>> On Thu, Jan 18, 2018 at 08:37:07AM -0800, Bart Van Assche wrote: > >>>> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c > >>>> index f16096af879a..c59c59cfd2a5 100644 > >>>> --- a/drivers/md/dm-rq.c > >>>> +++ b/drivers/md/dm-rq.c > >>>> @@ -761,6 +761,7 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx, > >>>> /* Undo dm_start_request() before requeuing */ > >>>> rq_end_stats(md, rq); > >>>> rq_completed(md, rq_data_dir(rq), false); > >>>> + blk_mq_delay_run_hw_queue(hctx, 100/*ms*/); > >>>> return BLK_STS_RESOURCE; > >>>> } > >>>> > >>> > >>> Nak. > >> > >> This patch fixes a regression that was introduced by you. You should know > >> that regressions are not acceptable. If you don't agree with this patch, > >> please fix the root cause. > > > > Yesterday I sent a patch, did you test that? > > That patch isn't going to be of much help. It might prevent you from > completely stalling, but it might take you tens of seconds to get there. Yeah, that is true, and why I marked it as RFC, but the case is so rare to happen. > > On top of that, it's a rolling timer that gets added when IO is queued, > and re-added if IO is pending when it fires. If the latter case is not > true, then it won't get armed again. So if IO fails to queue without > any other IO pending, you're pretty much in the same situation as if No IO pending detection may take a bit time, we can do it after BLK_STS_RESOURCE is returned and SCHED_RESTART is set. I have done this way in the following patch, what do you think of it? https://github.com/ming1/linux/commit/f866ce0b97b0ae22d033881da7eb07706fd458b4 -- Ming