Re: [PATCH v2] block: reduce kblockd_mod_delayed_work_on() CPU consumption

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/15/21 3:25 AM, John Garry wrote:
> On 14/12/2021 20:49, Jens Axboe wrote:
>> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
>> running 24 disks and using the 'none' scheduler. This happens off the
>> sched restart path, because SCSI requires the queue to be restarted async,
>> and hence we're hammering on mod_delayed_work_on() to ensure that the work
>> item gets run appropriately.
>>
>> Avoid hammering on the timer and just use queue_work_on() if no delay
>> has been specified.
>>
>> Reported-and-tested-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
>> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>
>> ---
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 1378d084c770..c1833f95cb97 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1484,6 +1484,8 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>>   int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>>   				unsigned long delay)
>>   {
>> +	if (!delay)
>> +		return queue_work_on(cpu, kblockd_workqueue, &dwork->work);
>>   	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
>>   }
>>   EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
>>
> 
> Hi Jens,
> 
> I have a related comment on the current code and interface it uses, if 
> you don't mind, as I did wonder if we are doing a msec_to_jiffies(0 [not 
> built-in const]) call somewhere.
> 
> So we pass msecs to blk-mq.c, and we do a msec_to_jiffies() call on it 
> before calling kblockd_mod_delayed_work_on(). Now most/all callsites 
> uses const value for the msec value, so if we did the msec_to_jiffies() 
> conversion at the callsites and passed a jiffies value, it should be 
> compiled out by gcc. This is my current __blk_mq_delay_run_hw_queue 
> assembler:
> 
> 0000000000001ef0 <__blk_mq_delay_run_hw_queue>:
>      [snip]
>      2024: a942dfb6 ldp x22, x23, [x29, #40]
>      2028: 2a1503e0 mov w0, w21
>      202c: 94000000 bl 0 <__msecs_to_jiffies>
> kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
>      2030: aa0003e2 mov x2, x0
>      2034: 91010261 add x1, x19, #0x40
>      2038: 2a1403e0 mov w0, w20
>      203c: 94000000 bl 0 <kblockd_mod_delayed_work_on>
> 
> I'm not sure if you would want to change so many APIs or if jiffies is 
> sensible to pass or even any performance gain. Additionally Function 
> blk_mq_delay_kick_requeue_list() would not see so much gain in such a 
> change as msec value is not const. Any thoughts? Maybe testing 
> performance would not do much harm.

In general I totally agree with you, it'd be smarter to flip the
conversion so it can be done in a more efficient manner. At the same
time, the queue delay running is not at all a fast path, so shouldn't
really matter in practice.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux