With the no-copy bvec patches and nullb, blk_bio_segment_split() eats >7%.
This series adds yet another fast path for it.

         8K      16K     32K     64K
before:  932     904     868     788
after:   934     919     902     862

I would appreciate it if anyone knows offhand the typical values of
queue_max_segments, etc. for NVMe.

Pavel Begunkov (2):
  block: add a function for *segment_split fast path
  block: add a fast path for seg split of large bio

 block/blk-merge.c | 107 +++++++++++++++++++++++++++++-----------------
 1 file changed, 68 insertions(+), 39 deletions(-)

-- 
2.24.0