Currently, when iomap and block direct IO get a bvec-based iterator, the
bvec is copied along with all the other accounting, which takes a lot of
CPU time and causes an additional allocation for larger bvecs. This
patchset makes them reuse the passed-in iter bvec instead.

[1,2] forbid zero-length bvec segments to avoid piling up special cases,
[3] skips/fixes PSI tracking so that bvecs are not iterated over an extra
time.

nullblk completion_nsec=0 submit_queues=NR_CORES, no merges, no stats
fio/t/io_uring /dev/nullb0 -d 128 -s 32 -c 32 -p 0 -B 1 -F 1 -b BLOCK_SIZE

BLOCK_SIZE     512     4K      8K      16K     32K     64K
===================================================================
old (KIOPS)    1208    1208    1131    1039    863     699
new (KIOPS)    1222    1222    1170    1137    1083    982

Jens previously got a 10% difference when polling real HW with small block
sizes, but that was with an older version that had one iov_iter_advance()
fewer.

since RFC:
- add target_core_file patch by Christoph
- make no-copy the default behaviour, remove the iter flag
- use iter_advance() instead of hacks to get revert to work
- add bvec iter_advance() optimisation patch
- remove PSI annotations from direct IO (iomap, block and fs/direct)
- add a note in d/f/porting

since v1:
- don't allow zero-length bvec segments (Ming)
- don't add a BIO_WORKINGSET-less version of bio_add_page(), just clear
  the flag at the end and leave it for further cleanups (Christoph)
- commit message and comments rewording (Dave)
- other nits by Christoph

since v2:
- add a comment in 1/7 (Christoph)
- add a note about 0-len bvecs in biovecs.rst (Matthew)

Christoph Hellwig (1):
  target/file: allocate the bvec array as part of struct
    target_core_file_cmd

Pavel Begunkov (6):
  splice: don't generate zero-len segement bvecs
  bvec/iter: disallow zero-length segment bvecs
  block/psi: remove PSI annotations from direct IO
  iov_iter: optimise bvec iov_iter_advance()
  bio: add a helper calculating nr segments to alloc
  bio: don't copy bvec for direct IO

 Documentation/block/biovecs.rst       |  2 +
 Documentation/filesystems/porting.rst | 16 ++++++
 block/bio.c                           | 71 +++++++++++++--------------
 drivers/target/target_core_file.c     | 20 +++-----
 fs/block_dev.c                        |  7 +--
 fs/direct-io.c                        |  2 +
 fs/iomap/direct-io.c                  |  9 ++--
 fs/splice.c                           |  9 ++--
 include/linux/bio.h                   | 13 +++++
 lib/iov_iter.c                        | 21 +++++++-
 10 files changed, 106 insertions(+), 64 deletions(-)

-- 
2.24.0
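
For readers unfamiliar with the iterator side, here is a minimal user-space
sketch of what advancing a segment-based iterator looks like when it walks
whole segments and relies on the "no zero-length segments" rule. It is not
the code from this series: struct seg, struct seg_iter and
seg_iter_advance() are hypothetical, simplified stand-ins for the kernel's
bvec and iov_iter, and the assert() only encodes the invariant as an
assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bvec segment; not the kernel's struct bio_vec. */
struct seg {
	size_t len;		/* bytes in this segment, assumed non-zero */
};

/* Hypothetical stand-in for a bvec-backed iov_iter. */
struct seg_iter {
	const struct seg *segs;	/* current segment */
	size_t nr_segs;		/* segments left, including the current one */
	size_t offset;		/* bytes already consumed in the current segment */
	size_t count;		/* total bytes left in the iterator */
};

/* Advance by up to `bytes`, stepping over whole segments at a time. */
static void seg_iter_advance(struct seg_iter *it, size_t bytes)
{
	if (bytes > it->count)
		bytes = it->count;
	it->count -= bytes;

	/* Measure the distance from the start of the current segment. */
	bytes += it->offset;
	while (it->nr_segs && bytes >= it->segs->len) {
		/* Invariant assumed by this sketch: no empty segments. */
		assert(it->segs->len != 0);
		bytes -= it->segs->len;
		it->segs++;
		it->nr_segs--;
	}
	it->offset = bytes;
}
```

With segments of 4096, 512 and 8192 bytes, advancing by 4196 bytes leaves
the iterator in the second segment at offset 100, and advancing past the
end leaves it empty with no segments left.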