> Paolo Valente and I found an unexpected surprise while investigating
> iodepth impact on throughput. Both on my device (SSD Samsung 960 Pro
> NVMe M.2 512GB) and his (SSD Plextor PX-256M5S), throughput
> decreases as the iodepth increases.
> ...
> What we observed is that with "iodepth=1 and ioengine=sync" throughput
> is greater than with "iodepth=8 and ioengine=libaio". For example:
> - iodepth=1, sync:  170.942 ± 11.122 [MB/s]
> - iodepth=8, async: 161.360 ± 10.165 [MB/s]
>
> So, I would like to ask:
> 1) Is it normal that throughput drops? (shouldn't the device's internal
> queues be more filled at a greater iodepth, and thus lead to greater
> throughput?)

The drive specs are:

* 3500 MB/s sequential read
* 57.344 MB/s random read, QD1 (14000 IOPS doing random 4 KiB reads)
* 1351.68 MB/s random read, QD4 (330000 IOPS doing random 4 KiB reads)

so the bigger issue is why it's underachieving by roughly 10x.

direct=0 means you're involving the page cache, which distorts results,
especially with libaio:

    iodepth=int
        Number of I/O units to keep in flight against the file. Note that
        increasing iodepth beyond 1 will not affect synchronous ioengines
        (except for small degrees when verify_async is in use). Even async
        engines may impose OS restrictions causing the desired depth not
        to be achieved. This may happen on Linux when using libaio and not
        setting direct=1, since buffered IO is not async on that OS. Keep
        an eye on the IO depth distribution in the fio output to verify
        that the achieved depth is as expected. Default: 1.

If you're trying to test storage device performance, stick with direct=1.
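
For comparison, something along these lines is how I'd rerun the two
cases with direct I/O (a minimal sketch only: the device path, block
size and runtime below are placeholders for whatever you are actually
testing, and randread is assumed since the specs above are random-read
numbers; both jobs are read-only):

    ; hypothetical job file -- adjust filename/bs/runtime to your setup
    [global]
    ; direct=1 bypasses the page cache so libaio can actually queue I/O
    direct=1
    filename=/dev/nvme0n1
    rw=randread
    bs=4k
    runtime=60
    time_based
    group_reporting

    [qd1-sync]
    ioengine=sync
    iodepth=1

    ; stonewall makes this job wait for the previous one to finish
    [qd8-libaio]
    stonewall
    ioengine=libaio
    iodepth=8

Then look at the "IO depths" distribution reported for the libaio job
to confirm the achieved depth really is 8 rather than collapsing back
to 1.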