On 10/13/21 11:20 PM, Christoph Hellwig wrote:
> On Tue, Oct 12, 2021 at 12:12:39PM -0600, Jens Axboe wrote:
>> It's expensive to browse the whole plug list for merge opportunities at
>> the IOPS rates that modern storage can do. For sequential IO, the one-hit
>> cached merge should suffice on fast drives, and for rotational storage the
>> IO scheduler will do a more exhaustive lookup based merge anyway.
>
> I don't really want to argue, but maybe some actual measurements to
> support the above claims would be useful in the commit log?

Sure, I do need to pad the commit messages a bit. One example: running an
IOPS bound workload caps out at ~5.6M IOPS, and if you check the profile,
this is at the top of the list:

  Overhead  Command   Shared Object     Symbol
+  20.89%   io_uring  [kernel.vmlinux]  [k] blk_attempt_plug_merge
+   4.98%   io_uring  [kernel.vmlinux]  [k] io_submit_sqes
+   4.78%   io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO
+   4.61%   io_uring  [kernel.vmlinux]  [k] blk_mq_submit_bio
+   3.67%   io_uring  [nvme]            [k] nvme_queue_rq

With merging disabled or the patch applied, we run at ~7.4M IOPS instead.
That's about a 33% improvement from killing a silly merge loop that isn't
even marked as an expensive merge.

I'll rework the patch; we can probably retain it as a last-insert kind of
merge point. But browsing the whole plug list for a merge candidate is
stupidity, regardless of the class of workload and storage.

-- 
Jens Axboe
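
For illustration, here is a minimal userspace sketch of the tradeoff under
discussion; the types and helpers (struct plug, back_mergeable(),
plug_merge_scan(), plug_merge_last()) are hypothetical stand-ins, not the
kernel's actual data structures or API. Scanning the whole plug list costs
O(n) per submitted bio, while a last-insert merge point checks only the
most recently queued request:

  #include <stdbool.h>
  #include <stddef.h>

  struct request {
          unsigned long long sector;  /* start sector */
          unsigned int nr_sectors;    /* length in sectors */
          struct request *next;       /* singly linked plug list */
  };

  struct plug {
          struct request *head;       /* oldest plugged request */
          struct request *tail;       /* most recently added request */
  };

  /* Back-merge test: the new bio starts exactly where rq ends. */
  static bool back_mergeable(const struct request *rq,
                             unsigned long long sector)
  {
          return rq->sector + rq->nr_sectors == sector;
  }

  /* O(n) per bio: walk every plugged request looking for a merge,
   * the kind of exhaustive scan being complained about above. */
  static bool plug_merge_scan(struct plug *plug,
                              unsigned long long sector, unsigned int nr)
  {
          for (struct request *rq = plug->head; rq; rq = rq->next) {
                  if (back_mergeable(rq, sector)) {
                          rq->nr_sectors += nr;  /* fold bio into rq */
                          return true;
                  }
          }
          return false;
  }

  /* O(1) per bio: only the last-inserted request is a candidate,
   * the "last-insert kind of merge point" suggested above. */
  static bool plug_merge_last(struct plug *plug,
                              unsigned long long sector, unsigned int nr)
  {
          if (plug->tail && back_mergeable(plug->tail, sector)) {
                  plug->tail->nr_sectors += nr;
                  return true;
          }
          return false;
  }

For sequential IO the last-inserted request is exactly where a back merge
would land anyway, so the O(1) check gives up little while avoiding a
per-bio list walk at multi-million IOPS rates.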