Re: [PATCH] block: Optimize bio_init()

On 9/11/21 9:19 PM, Bart Van Assche wrote:
> On 9/11/21 15:16, Jens Axboe wrote:
>> Looking at profile:
>>
>>   43.34 │      rep    stos %rax,%es:(%rdi)
>> I do wonder if rep stos is just not very well suited for small regions,
>> either in general or particularly on AMD.
>>
>> What do your profiles look like for before and after?
> 
> Since I do not know which tool was used to obtain the above
> information, I ran "perf record -ags sleep 10" while the test
> was running. I could not find bio_init() in the output, which I
> think means that the function got inlined. bio_alloc_bioset()
> did show up, however, and the time spent in that function is
> lower when IOPS are higher.

The above is from perf report, drilling down into the functions. Yours
shows up in bio_alloc_bioset(), and mine in bio_alloc_kiocb() since I'm
doing polled IO.

> The performance numbers in the patch description come from an
> Intel Xeon Gold 6154 CPU. I reran the test today on an old Intel
> Core i7-4790 CPU and obtained the opposite result: higher IOPS
> without this patch than with it, although the generated assembly
> looks the same. It seems that how fast "rep stos" runs depends
> on the CPU type?

It does appear so. Which is a bit frustrating...
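
For anyone who wants to compare CPUs outside the kernel, here is a rough
userspace sketch of the two initialization styles under discussion: one
memset() of the whole structure versus one store per field. Everything in
it is an assumption made for illustration: struct fake_bio and its fields
are invented stand-ins and not the real struct bio layout, the function
names are hypothetical, and a userspace compiler may expand the memset()
into plain stores rather than "rep stos", so check the disassembly before
reading anything into the numbers. The empty asm barrier is a GCC/Clang
extension.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for struct bio; fields and sizes are invented. */
struct fake_bio {
    struct fake_bio *next;
    void *bdev;
    unsigned int opf;
    unsigned short flags;
    unsigned short ioprio;
    unsigned long long sector;
    unsigned int size;
    unsigned int idx;
    void *end_io;
    void *private_data;
    unsigned short vcnt;
    unsigned short max_vecs;
    int cnt;
    void *io_vec;
};

/* Style 1: clear the whole structure, then set the non-zero fields. */
void fake_bio_init_memset(struct fake_bio *b, void *table,
                          unsigned short max_vecs)
{
    memset(b, 0, sizeof(*b));
    b->cnt = 1;
    b->io_vec = table;
    b->max_vecs = max_vecs;
}

/* Style 2: one explicit store per field, no memset(). */
void fake_bio_init_fields(struct fake_bio *b, void *table,
                          unsigned short max_vecs)
{
    b->next = NULL;
    b->bdev = NULL;
    b->opf = 0;
    b->flags = 0;
    b->ioprio = 0;
    b->sector = 0;
    b->size = 0;
    b->idx = 0;
    b->end_io = NULL;
    b->private_data = NULL;
    b->vcnt = 0;
    b->max_vecs = max_vecs;
    b->cnt = 1;
    b->io_vec = table;
}

/* Field-by-field comparison, so padding bytes are never inspected. */
int fake_bio_same_state(const struct fake_bio *a, const struct fake_bio *b)
{
    return a->next == b->next && a->bdev == b->bdev && a->opf == b->opf &&
           a->flags == b->flags && a->ioprio == b->ioprio &&
           a->sector == b->sector && a->size == b->size &&
           a->idx == b->idx && a->end_io == b->end_io &&
           a->private_data == b->private_data && a->vcnt == b->vcnt &&
           a->max_vecs == b->max_vecs && a->cnt == b->cnt &&
           a->io_vec == b->io_vec;
}

typedef void (*init_fn)(struct fake_bio *, void *, unsigned short);

/* Average wall-clock nanoseconds per call of init over iters calls. */
double ns_per_init(init_fn init, long iters)
{
    struct fake_bio b;
    struct timespec t0, t1;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        init(&b, NULL, 4);
        /* Compiler barrier so the stores to b are not optimized away. */
        __asm__ volatile("" : : "r"(&b) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return ((t1.tv_sec - t0.tv_sec) * 1e9 +
            (t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling ns_per_init(fake_bio_init_memset, n) and
ns_per_init(fake_bio_init_fields, n) with a large n on each machine gives
a crude per-call figure for each style. Because the calls go through a
function pointer, nothing is inlined, unlike bio_init() in the profiles
above, so treat the output as a hint at best.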

-- 
Jens Axboe



