On 1/19/18 8:20 AM, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
>> Please see queue_delayed_work_on(), hctx->run_work is shared by all
>> scheduling, once blk_mq_delay_run_hw_queue(100ms) returns, no new
>> scheduling can make progress during the 100ms.
>
> How about addressing that as follows:
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f7515dd95a36..57f8379a476d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1403,9 +1403,9 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
>  		put_cpu();
>  	}
>
> -	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> -					 &hctx->run_work,
> -					 msecs_to_jiffies(msecs));
> +	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> +				    &hctx->run_work,
> +				    msecs_to_jiffies(msecs));
>  }

Exactly. That's why I said it was just a bug in my previous email, not
honoring a newer run is just stupid. Only other thing you have to be
careful with here is the STOPPED bit.

-- 
Jens Axboe
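
For anyone following along, here is a rough userspace sketch of the semantic difference the patch relies on. It is not kernel code: the toy_* names and the struct are made up for illustration, and time is just an integer counter. It only models the documented behavior that queue_delayed_work_on() does nothing when the work is already pending, while mod_delayed_work_on() re-arms the timer with the new delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel delayed_work item. */
struct toy_delayed_work {
	bool pending;          /* models the "work is pending" state */
	unsigned long expires; /* absolute time at which the work would run */
};

/*
 * Models queue_delayed_work_on(): if the work is already pending, leave
 * the existing timer untouched and return false.
 */
static bool toy_queue_delayed_work(struct toy_delayed_work *w,
				   unsigned long now, unsigned long delay)
{
	assert(w != NULL);
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/*
 * Models mod_delayed_work_on(): always re-arm with the new delay.
 * Returns true if the work was already pending, false if it was idle.
 */
static bool toy_mod_delayed_work(struct toy_delayed_work *w,
				 unsigned long now, unsigned long delay)
{
	bool was_pending;

	assert(w != NULL);
	was_pending = w->pending;
	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/*
 * Scenario from the thread: a delayed run of 100 time units is queued at
 * t=0, then an immediate run is requested at t=10. Returns the time at
 * which the shared work item would actually fire.
 */
static unsigned long toy_expiry_after_rerun(bool use_mod)
{
	struct toy_delayed_work w = { false, 0 };

	toy_queue_delayed_work(&w, 0, 100);
	if (use_mod)
		toy_mod_delayed_work(&w, 10, 0);
	else
		toy_queue_delayed_work(&w, 10, 0);
	return w.expires;
}
```

With the queue variant the second request is dropped and the work still fires at t=100, which is the stall Ming describes. With the mod variant the newer request wins and the work fires at t=10. The sketch says nothing about the STOPPED bit, which still needs separate care.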