The idea is to avoid copying, merging, etc. the bvec from the iterator into
the bio in direct I/O, and to use the one we've already got instead. Hook it
up for io_uring. I've had an eye on this for a long time, and it was also
brought up by Matthew just recently. Let me know if I forgot or misplaced
some tags.

A benchmark went from 430 KIOPS to 540 KIOPS, or +25%, on bare metal, and
perf shows that bio_iov_iter_get_pages() was taking ~20%. The test is pretty
silly, but the result is still impressive. I'll redo it closer to reality for
the next iteration; I need to double check some cases anyway.

If the same is applied to iomap, the common chunk can be moved from block_dev
into bio_iov_iter_get_pages(), but if there is any benefit for filesystems,
they should explicitly opt in with ITER_BVEC_FLAG_FIXED.

# how to apply

based on Jens' for-11/block + Ming's nr_vec patch, + io_uring fix,
9c3a205c5ffa36e96903c2 ("io_uring: fix ITER_BVEC check"),
or get it from here:
https://github.com/isilence/linux/commits/bvec_nocopy

# how to reproduce

null_blk queue_mode=2 completion_nsec=0 submit_queues=NUM_CPU
fio/t/io_uring with null blk, no iopoll, BS=16*4096

Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>

Pavel Begunkov (2):
  iov: introduce ITER_BVEC_FLAG_FIXED
  block: no-copy bvec for direct IO

 fs/block_dev.c      | 30 +++++++++++++++++++++++++++++-
 fs/io_uring.c       |  1 +
 include/linux/uio.h | 14 +++++++++++---
 3 files changed, 41 insertions(+), 4 deletions(-)

-- 
2.24.0
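
For readers who want the copy vs. no-copy distinction in miniature, here is a
userspace sketch. It is not the kernel code from this series: struct seg,
struct request and the three helpers are hypothetical stand-ins for the bvec
array and the bio, and the lifetime rule in the comments (the caller keeps the
array alive and unmodified until the I/O completes) is an assumption about
what an opt-in like ITER_BVEC_FLAG_FIXED would have to guarantee, not
something stated in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one buffer segment (NOT the kernel's bio_vec). */
struct seg {
	void *base;	/* start of the segment */
	size_t len;	/* length in bytes */
};

#define INLINE_SEGS 16

/* Hypothetical stand-in for an I/O request (NOT the kernel's struct bio). */
struct request {
	const struct seg *segs;	/* segment array the request will walk */
	size_t nr_segs;
	int borrowed;		/* 1 if segs points at caller-owned memory */
	struct seg inline_segs[INLINE_SEGS];
};

/* Copying path: duplicate every segment descriptor into the request. */
static int request_copy_segs(struct request *rq, const struct seg *src,
			     size_t n)
{
	if (n > INLINE_SEGS)
		return -1;
	memcpy(rq->inline_segs, src, n * sizeof(*src));
	rq->segs = rq->inline_segs;
	rq->nr_segs = n;
	rq->borrowed = 0;
	return 0;
}

/*
 * No-copy path: point the request at the caller's array.
 * Assumption: the caller keeps the array alive and unmodified until the
 * request completes; otherwise the copying path must be used.
 */
static void request_borrow_segs(struct request *rq, const struct seg *src,
				size_t n)
{
	assert(src != NULL || n == 0);
	rq->segs = src;
	rq->nr_segs = n;
	rq->borrowed = 1;
}

/* Total payload size; works the same for copied and borrowed arrays. */
static size_t request_bytes(const struct request *rq)
{
	size_t total = 0;

	for (size_t i = 0; i < rq->nr_segs; i++)
		total += rq->segs[i].len;
	return total;
}
```

The borrowing path does constant work regardless of the number of segments,
which is the kind of saving the perf numbers above point at; the price is the
lifetime requirement on the caller, hence an explicit opt-in.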