On Tue, Sep 19 2017 at 1:43am -0400,
Ming Lei <ming.lei@xxxxxxxxxx> wrote:

> On Mon, Sep 18, 2017 at 03:18:16PM +0000, Bart Van Assche wrote:
> > On Sun, 2017-09-17 at 20:40 +0800, Ming Lei wrote:
> > > "if no request has completed before the delay has expired" can't be a
> > > reason to rerun the queue, because the queue can still be busy.
> >
> > That statement of you shows that there are important aspects of the SCSI
> > core and dm-mpath driver that you don't understand.
>
> Then can you tell me why blk-mq's SCHED_RESTART can't cover
> the rerun when there are in-flight requests? What is the case
> in which dm-rq can return BUSY and there aren't any in-flight
> requests meantime?
>
> Also you are the author of adding 'blk_mq_delay_run_hw_queue(
> hctx, 100/*ms*/)' in dm-rq, you never explain in commit
> 6077c2d706097c0(dm rq: Avoid that request processing stalls
> sporadically) what the root cause is for your request stall
> and why this patch fixes your issue. Even you don't explain
> why is the delay 100ms?
>
> So it is a workaround, isn't it?
>
> My concern is that it isn't good to add blk_mq_delay_run_hw_queue(hctx, 100/*ms*/)
> in the hot path since it should have been covered by SCHED_RESTART
> if there are in-flight requests.

This thread proves that it is definitely brittle to be relying on fixed
delays like this:
https://patchwork.kernel.org/patch/9703249/

Mike