Question about merging raw block device writes

When using fio to measure sequential I/O performance against a SCSI
disk, I see a big difference between reads and writes.  Reads are merged
well, but writes are not merged at all.  Here are the fio command lines:

fio --bs=4k --ioengine=libaio --iodepth=64 --filename=/dev/sdc --direct=1 --numjobs=8 --rw=read --name=readmerge

fio --bs=4k --ioengine=libaio --iodepth=64 --filename=/dev/sdc --direct=1 --numjobs=8 --rw=write --name=writemerge

/sys/block/sdc/queue/scheduler is set to [none].  Linux kernel is 5.18.5.
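
One way to put numbers on the difference is to snapshot the merge
counters in /sys/block/sdc/stat before and after each fio run.  The
following is a minimal sketch, assuming the documented layout of that
file (field 2 is read merges, field 6 is write merges).  The helper
names merge_counts and merge_delta are hypothetical, not part of any
tool.

```shell
#!/bin/sh
# merge_counts FILE
# Print "read_merges write_merges" from a block-device stat file.
# Assumption: FILE follows the /sys/block/<dev>/stat layout, where
# field 2 = read merges and field 6 = write merges.
merge_counts() {
    awk '{ print $2, $6 }' "$1"
}

# merge_delta BEFORE AFTER
# Subtract two "read write" snapshots taken with merge_counts.
merge_delta() {
    echo "$1 $2" | awk '{
        printf "read_merges=%d write_merges=%d\n", $3 - $1, $4 - $2
    }'
}

# Example use around one fio run (not executed here):
#   before=$(merge_counts /sys/block/sdc/stat)
#   fio --bs=4k ... --rw=write --name=writemerge
#   after=$(merge_counts /sys/block/sdc/stat)
#   merge_delta "$before" "$after"
```

A write_merges delta of zero after the write job, next to a large
read_merges delta after the read job, would match the behavior
described above.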

The code difference appears to be in blkdev_read_iter() vs.
blkdev_write_iter().  The latter has blk_start_plug()/blk_finish_plug()
while the former does not.  As a result, blk_mq_submit_bio() puts
write requests on the pluglist, which is flushed almost immediately.
blk_mq_submit_bio() sends the read requests through
blk_mq_try_issue_directly(), and they are later merged in a blk-mq
software queue.

For writes, the pluglist flush path is as follows:

blk_finish_plug()
__blk_flush_plug()
blk_mq_flush_plug_list()
blk_mq_plug_issue_direct()
blk_mq_request_issue_directly()
__blk_mq_try_issue_directly()

The last function is called with "bypass_insert" set to true, so if the
request must wait for budget, it is not placed on a blk-mq software
queue as reads are, and no merging happens.
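
Whether any merges happen at all can also be checked from a block
trace.  The sketch below counts back-merge ("M") and front-merge ("F")
events per direction in blkparse text output.  It assumes the default
blkparse line format, in which field 6 is the action and field 7 is
the RWBS string.  The function name count_merges is hypothetical.

```shell
#!/bin/sh
# count_merges
# Read blkparse text output on stdin and count merge events.
# Assumption: default blkparse format, so $6 is the action
# ("M" = back merge, "F" = front merge) and $7 is the RWBS field
# (contains "R" for reads, "W" for writes).
count_merges() {
    awk '($6 == "M" || $6 == "F") {
             if ($7 ~ /R/)      r++
             else if ($7 ~ /W/) w++
         }
         END { printf "read_merges=%d write_merges=%d\n", r, w }'
}

# Example use while a fio job is running (needs root, not executed here):
#   blktrace -d /dev/sdc -o - | blkparse -i - | count_merges
```

If the write path never reaches a software queue, the trace taken
during the write job should show no "M" or "F" events for writes.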

I don't know the blk-mq layer well enough to know what's
supposed to be happening.  Is it intentional to not do write merges
in this case? Or did something get broken, and it wasn't noticed?

Michael



