Hi,

One of the key elements for eking out the very last bit of performance with io_uring is being able to test your design and improvements. I had a bit of help on that front since Intel got me some Gen2 Optane SSD samples a while back, and I've been using those to guide improvements - and vice versa, to see which changes end up being detrimental to latencies or scalability. I haven't been able to share any numbers on that until now.

So without further ado, here's some insight into what is possible with io_uring, and the Linux IO stack, today.

Test setup:

Kernel:          5.9.0-rc1
System:          Intel Ice Lake-SP Next-Gen Xeon (https://www.servethehome.com/intel-ice-lake-sp-next-gen-xeon-architecture-at-hc32/)
Storage device:  Single Gen2 Optane SSD (https://blocksandfiles.com/2020/08/14/intel-gen-2-optane-details/)
Benchmark:       t/io_uring from fio
Workload:        Single thread random 512b O_DIRECT reads

Note that this is utilizing a single core in the system, out of the many available. t/io_uring is used for low-overhead IO generation, and we're using polled IO with io_uring, along with registered buffers and files. 512b IOs are used to keep us well below the bandwidth ceiling. Throughput is easy, IOPS and latency are harder. My goal here was to demonstrate what is possible today with io_uring in terms of efficiency.

Results
-----------------------------------------------------------
QD128 : 2.58M IOPS per core (34.9 usec avg latency)
QD16  : 2.06M IOPS per core ( 6.9 usec avg latency)
QD1   :  290K IOPS per core ( 3.4 usec avg latency)

Outside of showing what's possible with io_uring today, these results are also a testament to the general Linux IO stack efficiency. The introduction of blk-mq was as much about general efficiency as it was about scalability. That was a design criterion for both blk-mq and io_uring from day 1.
Even if you aren't driving millions of IOPS, or using tons of threads/cores, you still care about getting your work done in the shortest amount of time, wasting as few cycles as possible.

More to come...

-- 
Jens Axboe