On 3/2/20 4:55 PM, Bijan Mottahedeh wrote:
> I'm seeing a sizeable drop in perf with polled fio tests for block
> sizes > 128k:
> 
> filename=/dev/nvme0n1
> rw=randread
> direct=1
> time_based=1
> randrepeat=1
> gtod_reduce=1
> 
> fio --readonly --ioengine=io_uring --iodepth 1024 --fixedbufs --hipri --numjobs=16
> fio --readonly --ioengine=pvsync2 --iodepth 1024 --hipri --numjobs=16
> 
> Compared with the pvsync2 engine, the only major difference I could see
> was the dio path: __blkdev_direct_IO() for io_uring vs.
> __blkdev_direct_IO_simple() for pvsync2, because of the is_sync_kiocb()
> check.
> 
> static ssize_t
> blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
> {
> 	...
> 	if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
> 		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
> 
> 	return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
> }
> 
> Just for an experiment, I hacked the io_uring code to force it through
> the _simple() path and got better numbers. The variance is fairly high,
> but the drop at bs > 128k seems consistent:
> 
> # baseline
> READ: bw=3167MiB/s (3321MB/s), 186MiB/s-208MiB/s (196MB/s-219MB/s) #128k
> READ: bw=898MiB/s (941MB/s), 51.2MiB/s-66.1MiB/s (53.7MB/s-69.3MB/s) #144k
> READ: bw=1576MiB/s (1652MB/s), 81.8MiB/s-109MiB/s (85.8MB/s-114MB/s) #256k
> 
> # hack
> READ: bw=2705MiB/s (2836MB/s), 157MiB/s-174MiB/s (165MB/s-183MB/s) #128k
> READ: bw=2901MiB/s (3042MB/s), 174MiB/s-194MiB/s (183MB/s-204MB/s) #144k
> READ: bw=4194MiB/s (4398MB/s), 252MiB/s-271MiB/s (265MB/s-284MB/s) #256k

A quick guess would be that the IO is being split above 128K, and hence
the polling only catches one of the parts?

-- 
Jens Axboe
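
For context on the _simple() path: is_sync_kiocb() is false for io_uring's
async iocbs, which is why the quoted dispatch never takes it for that engine.
The sketch below shows one plausible way to force single-bio requests down
that path by relaxing the check in fs/block_dev.c; the experiment above
reportedly patched the io_uring side instead, so this is only an illustration
of the effect, not the hack that produced the numbers.

/*
 * Illustration only -- not the patch that was actually tested.
 * Dropping the is_sync_kiocb() requirement sends any request that fits
 * in one bio down __blkdev_direct_IO_simple().  Caveat: the _simple()
 * path waits for completion inline, so an async (io_uring) submission
 * effectively becomes synchronous here.
 */
static ssize_t
blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
{
	int nr_pages;

	nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
	if (!nr_pages)
		return 0;

	/* was: is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES */
	if (nr_pages <= BIO_MAX_PAGES)
		return __blkdev_direct_IO_simple(iocb, iter, nr_pages);

	return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
}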
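
To make the splitting guess concrete: in the multi-bio direct IO path, each
submit_bio() returns its own cookie, but only the last one ends up in
iocb->ki_cookie, which is all the polled completion side can later hand to
blk_poll(). The sketch below only shows that cookie-retention shape; it is
simplified and not the actual __blkdev_direct_IO() source (error handling,
sector setup and completion bookkeeping are omitted). If the split instead
happens below submit_bio() against the device's queue limits -- a more likely
trigger around 128K than BIO_MAX_PAGES -- the effect would presumably be
similar: the poll loop ends up spinning for only one piece of the original
request.

/*
 * Simplified sketch, not the kernel source: how a multi-bio submission
 * retains a single poll cookie.  Error handling, sector setup and
 * completion bookkeeping are all omitted; the helpers used are the
 * regular block layer APIs of that era.
 */
static blk_qc_t dio_submit_sketch(struct kiocb *iocb, struct iov_iter *iter,
				  struct block_device *bdev)
{
	blk_qc_t qc = BLK_QC_T_NONE;
	struct bio *bio;

	while (iov_iter_count(iter)) {
		bio = bio_alloc(GFP_KERNEL, BIO_MAX_PAGES);
		bio_set_dev(bio, bdev);
		bio->bi_opf = REQ_OP_READ;
		if (iocb->ki_flags & IOCB_HIPRI)
			bio_set_polled(bio, iocb);

		/* consumes up to BIO_MAX_PAGES worth of the iterator */
		bio_iov_iter_get_pages(bio, iter);

		/* each submission returns a cookie; only the last one is kept */
		qc = submit_bio(bio);
	}

	/* async polled completion will spin on this single cookie */
	WRITE_ONCE(iocb->ki_cookie, qc);
	return qc;
}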