Re: Question: t/io_uring performance

What's the advertised peak random read performance of the devices you are using?

I use two Intel P4510 (2 TB) SSDs for the experiments (and a third SSD for the OS). The SSDs are advertised at 640k IOPS (4k random reads), so the 1.6M IOPS I get using 2 threads is already a lot more than advertised. Still, I wonder why I cannot reach that (or at least something like 1.3M IOPS) using a single core. With 512-byte blocks, it should also be possible to achieve a bit more than 1.0M IOPS.
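As a quick sanity check of those numbers, here is a small sketch of the arithmetic. It assumes the 640k figure is per drive and that both drives contribute equally; the variable names are only illustrative.

```python
# Figures taken from the message above; the "per drive" reading is an assumption.
ADVERTISED_IOPS_PER_DRIVE = 640_000
NUM_DRIVES = 2
MEASURED_IOPS_TWO_THREADS = 1_600_000
SINGLE_CORE_TARGET = 1_300_000


def combined_advertised(per_drive: int, drives: int) -> int:
    """Aggregate advertised IOPS if every drive reaches its rated figure."""
    return per_drive * drives


def ratio(measured: int, advertised: int) -> float:
    """How far a measured result is above (or below) the advertised figure."""
    return measured / advertised


combined = combined_advertised(ADVERTISED_IOPS_PER_DRIVE, NUM_DRIVES)
print(f"combined advertised: {combined}")
print(f"2 threads vs advertised: {ratio(MEASURED_IOPS_TWO_THREADS, combined):.2f}x")
print(f"single-core target vs advertised: {ratio(SINGLE_CORE_TARGET, combined):.2f}x")
```

Under that reading, the single-core target of about 1.3M IOPS is roughly the combined rated figure of the two drives.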

Sounds like IRQs are expensive on your box, it does vary quite a bit between systems.

That could definitely be the case, as the processor (EPYC 7702P) seems to have some NUMA characteristics even when it is configured as a single node. With NPS=1, I still see a difference of about 10K-50K IOPS when I use cores that would belong to different NUMA domains than the SSDs. In the measurements above, however, both the interrupts and the benchmark are pinned to a core "near" the SSDs.
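For anyone wanting to verify such pinning, the following is a rough sketch that lists the IRQs of a device from /proc/interrupts and prints their CPU affinity. The prefix "nvme0q" is an assumed queue naming for the first NVMe controller, and the parser only looks at the last field of each line, so shared interrupts may be missed.

```python
from pathlib import Path
from typing import List


def irqs_for_device(interrupts_text: str, name_prefix: str) -> List[int]:
    """Return numeric IRQs whose handler name (last field) starts with name_prefix."""
    irqs = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue  # header line with CPU names, or an empty line
        irq = fields[0][:-1]
        if not irq.isdigit():
            continue  # skip non-numeric entries such as NMI or LOC
        if fields[-1].startswith(name_prefix):
            irqs.append(int(irq))
    return irqs


def irq_affinity(irq: int) -> str:
    """Read the CPU list an IRQ may be delivered to."""
    return Path(f"/proc/irq/{irq}/smp_affinity_list").read_text().strip()


if __name__ == "__main__":
    interrupts = Path("/proc/interrupts")
    if interrupts.exists():
        # "nvme0q" is an assumption about how the queues are named on this box.
        for irq in irqs_for_device(interrupts.read_text(), "nvme0q"):
            try:
                print(f"IRQ {irq}: CPUs {irq_affinity(irq)}")
            except OSError as err:
                print(f"IRQ {irq}: cannot read affinity ({err})")
```

Comparing the printed CPU lists with the core the benchmark is pinned to shows whether completions are handled on the same core.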

Did you turn off iostats? If so, then there's a few things in the kernel config that can cause this. One is BLK_CGROUP_IOCOST, is that enabled?

Yes, I turned off iostats for both drives, but BLK_CGROUP_IOCOST is enabled.
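Both settings can be checked with a short script. This is only a sketch: the device names nvme0n1 and nvme1n1 are assumptions, and the kernel config is only found if the kernel exposes /proc/config.gz or the distribution installs /boot/config-<release>.

```python
import gzip
import platform
from pathlib import Path
from typing import Optional


def kconfig_value(config_text: str, option: str) -> Optional[str]:
    """Return the value of a kernel config option, "n" if unset, None if absent."""
    name = option if option.startswith("CONFIG_") else "CONFIG_" + option
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
        if line == f"# {name} is not set":
            return "n"
    return None


def load_kernel_config() -> Optional[str]:
    """Try the two common locations of the running kernel's config."""
    proc = Path("/proc/config.gz")
    if proc.exists():
        return gzip.decompress(proc.read_bytes()).decode()
    boot = Path("/boot/config-" + platform.release())
    if boot.exists():
        return boot.read_text()
    return None


def iostats_enabled(device: str) -> bool:
    """True if per-device I/O accounting is on (queue/iostats contains 1)."""
    return Path(f"/sys/block/{device}/queue/iostats").read_text().strip() == "1"


if __name__ == "__main__":
    try:
        config = load_kernel_config()
    except OSError:
        config = None
    if config is not None:
        print("BLK_CGROUP_IOCOST =", kconfig_value(config, "BLK_CGROUP_IOCOST"))
    for dev in ("nvme0n1", "nvme1n1"):  # assumed device names
        try:
            print(dev, "iostats enabled:", iostats_enabled(dev))
        except OSError:
            print(dev, "not found")
```

A result of "y" for BLK_CGROUP_IOCOST together with iostats reported as disabled would match the setup described here.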

Might be more if you're still on that old kernel.

I'm on an old kernel, but I am also comparing my results with results that you got on the same kernel back in 2019 (my target is ~1.6M like in [0], not something like the insane 2.5M you got recently [1]). I know that it's not a 100% fair comparison because of the different hardware, but I still fear that there is some configuration option that I am missing.

Would be handy to have -g enabled for your perf record and report, since that would show us exactly who's calling the expensive bits.



I did run it with -g (I copied the commands from your previous email and just swapped in the PID). You also had the "--no-children" parameter in that command, and I guess you were looking for the output without it. You can find the output from a simple "perf report -g" attached.

Thank you again for your help and have a nice day
Hans-Peter

[0]: https://twitter.com/axboe/status/1174777844313911296
[1]: https://lore.kernel.org/io-uring/4af91b50-4a9c-8a16-9470-a51430bd7733@xxxxxxxxx

Attachment: output.gz
Description: application/gzip

