On 12/19/21 7:58 AM, Jens Axboe wrote:
> On 12/18/21 12:02 PM, Jens Axboe wrote:
>> On 12/18/21 11:57 AM, Alex Xu (Hello71) wrote:
>>> Hi,
>>>
>>> I recently noticed that between 6441998e2e and 9eaa88c703, I/O became
>>> much slower on my machine using ext4 on dm-crypt on NVMe with bfq
>>> scheduler. Checking iostat during heavy usage (find / -xdev and fstrim
>>> -v /), maximum IOPS had fallen from ~10000 to ~100. Reverting cb2ac2912a
>>> ("block: reduce kblockd_mod_delayed_work_on() CPU consumption") resolves
>>> the issue.
>>
>> Hmm interesting. I'll try and see if I can reproduce this and come up
>> with a fix.
>
> I can reproduce this. Alex, can you see if this one helps? Trying to see
> if we can hit a happy medium here that avoids hammering on that timer,
> but it really depends on what the mix is here of delay with pending,
> or no delay with no pending.
>
> Dexuan, can you test this for your test case too? I'm going to queue
> up a revert for -rc6 just in case.

This one should be better...

diff --git a/block/blk-core.c b/block/blk-core.c
index c1833f95cb97..5e9e3c2b7a94 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1481,12 +1481,17 @@ int kblockd_schedule_work(struct work_struct *work)
 }
 EXPORT_SYMBOL(kblockd_schedule_work);
 
-int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
-				unsigned long delay)
+void kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
+				 unsigned long msecs)
 {
-	if (!delay)
-		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
-	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
+	if (!msecs) {
+		cancel_delayed_work(dwork);
+		queue_work_on(cpu, kblockd_workqueue, &dwork->work);
+	} else {
+		unsigned long delay = msecs_to_jiffies(msecs);
+
+		mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
+	}
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8874a63ae952..95288a98dae1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1155,8 +1155,7 @@ EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
-				    msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, msecs);
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
@@ -1868,7 +1867,7 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 	}
 
 	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
-				    msecs_to_jiffies(msecs));
+				    msecs);
 }
 
 /**
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bd4370baccca..40748eedddbb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1159,7 +1159,7 @@ static inline unsigned int block_size(struct block_device *bdev)
 }
 
 int kblockd_schedule_work(struct work_struct *work);
-int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay);
+void kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long msecs);
 
 #define MODULE_ALIAS_BLOCKDEV(major,minor) \
 	MODULE_ALIAS("block-major-" __stringify(major) "-" __stringify(minor))

-- 
Jens Axboe