On Tue, Sep 19, 2017 at 11:48:23AM -0400, Mike Snitzer wrote:
> On Tue, Sep 19 2017 at 1:43am -0400,
> Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> > On Mon, Sep 18, 2017 at 03:18:16PM +0000, Bart Van Assche wrote:
> > > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > > > "if no request has completed before the delay has expired" can't be a
> > > > reason to rerun the queue, because the queue can still be busy.
> > >
> > > That statement of you shows that there are important aspects of the SCSI
> > > core and dm-mpath driver that you don't understand.
> >
> > Then can you tell me why blk-mq's SCHED_RESTART can't cover
> > the rerun when there are in-flight requests? What is the case
> > in which dm-rq can return BUSY and there aren't any in-flight
> > requests meantime?
> >
> > Also you are the author of adding 'blk_mq_delay_run_hw_queue(
> > hctx, 100/*ms*/)' in dm-rq, you never explain in commit
> > 6077c2d706097c0(dm rq: Avoid that request processing stalls
> > sporadically) what the root cause is for your request stall
> > and why this patch fixes your issue. Even you don't explain
> > why is the delay 100ms?
> >
> > So it is a workaound, isn't it?
> >
> > My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 100/*ms*/)
> > in the hot path since it should been covered by SCHED_RESTART
> > if there are in-flight requests.
>
> This thread proves that it is definitely brittle to be relying on fixed
> delays like this:
> https://patchwork.kernel.org/patch/9703249/

I couldn't agree more. No one has mentioned the root cause, and the
request stall may already have been fixed recently. Keeping the
workaround in the hot path is a bit annoying.

-- 
Ming