Re: [PATCH v2] block: reduce kblockd_mod_delayed_work_on() CPU consumption

On 14/12/2021 20:49, Jens Axboe wrote:
Dexuan reports that he's seeing spikes of very heavy CPU utilization when
running 24 disks and using the 'none' scheduler. This happens off the
sched restart path, because SCSI requires the queue to be restarted async,
and hence we're hammering on mod_delayed_work_on() to ensure that the work
item gets run appropriately.

Avoid hammering on the timer and just use queue_work_on() if no delay
has been specified.

Reported-and-tested-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>

---

diff --git a/block/blk-core.c b/block/blk-core.c
index 1378d084c770..c1833f95cb97 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1484,6 +1484,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
  				unsigned long delay)
  {
+	if (!delay)
+		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
  	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
  }
  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);


Hi Jens,

I have a related comment on the current code and the interface it uses, if you don't mind. I wondered whether we are making a msecs_to_jiffies(0) call somewhere in which the argument is not a compile-time constant.

So we pass msecs to blk-mq.c, and we do a msecs_to_jiffies() call on it before calling kblockd_mod_delayed_work_on(). Most, if not all, callsites use a const value for the msec argument, so if we did the msecs_to_jiffies() conversion at the callsites and passed a jiffies value instead, the conversion should be compiled out by gcc. This is my current __blk_mq_delay_run_hw_queue assembler:

0000000000001ef0 <__blk_mq_delay_run_hw_queue>:
    [snip]
    2024: a942dfb6 ldp x22, x23, [x29, #40]
    2028: 2a1503e0 mov w0, w21
    202c: 94000000 bl 0 <__msecs_to_jiffies>
kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
    2030: aa0003e2 mov x2, x0
    2034: 91010261 add x1, x19, #0x40
    2038: 2a1403e0 mov w0, w20
    203c: 94000000 bl 0 <kblockd_mod_delayed_work_on>

I'm not sure whether you would want to change so many APIs, whether jiffies is a sensible unit to pass, or even whether there would be any performance gain. Additionally, blk_mq_delay_kick_requeue_list() would not gain much from such a change, as its msec value is not const. Any thoughts? Maybe testing performance would not do much harm.
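
To make the idea concrete, here is a rough user-space sketch of the two interface shapes. It is only a model, not kernel code: every model_* name, struct model_hctx, and the MODEL_HZ value of 250 are made up for illustration, and the rounding-up conversion is an assumption about how a msecs-to-jiffies helper behaves. The point is that when the conversion is an inline function applied to a literal at the callsite, the compiler is free to fold it to a constant, whereas a conversion done inside the callee only ever sees a runtime value.

```c
#include <stdbool.h>

/* Assumed tick rate for this model only; a real kernel's HZ is a config choice. */
#define MODEL_HZ 250ULL

/* Hypothetical stand-in for a hardware queue context. */
struct model_hctx {
	unsigned long last_delay;	/* delay, in model jiffies, of the last request */
	bool queued_directly;		/* true if the zero-delay fast path was taken */
};

/* Model conversion: milliseconds to ticks, rounding up. */
static inline unsigned long model_msecs_to_jiffies(unsigned int ms)
{
	return (unsigned long)((ms * MODEL_HZ + 999ULL) / 1000ULL);
}

/*
 * Proposed shape: the callee takes jiffies. A zero delay takes the
 * direct-queue path, mirroring the check added by the patch above.
 */
static inline void model_delay_run_jiffies(struct model_hctx *h,
					   unsigned long delay)
{
	h->last_delay = delay;
	h->queued_directly = (delay == 0);
}

/*
 * Current shape: the callee takes milliseconds and converts internally,
 * so the conversion operates on a runtime parameter.
 */
static inline void model_delay_run_ms(struct model_hctx *h, unsigned int ms)
{
	model_delay_run_jiffies(h, model_msecs_to_jiffies(ms));
}
```

A callsite under the proposed shape would read model_delay_run_jiffies(&h, model_msecs_to_jiffies(100)); with a literal argument, an optimising compiler can reduce the conversion to a constant. Whether that is measurable in the real code path is exactly the open question.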

Thanks,
John


