Re: don't reorder requests passed to ->queue_rqs

On 11/13/24 7:20 AM, Christoph Hellwig wrote:
> Currently blk-mq reorders requests when adding them to the plug because
> the request list can't do efficient tail appends.  When the plug is
> directly issued using ->queue_rqs that means reordered requests are
> passed to the driver, which can lead to very bad I/O patterns when
> not corrected, especially on rotational devices (e.g. NVMe HDD) or
> when using zone append.
>
> This series first adds two easily backportable workarounds to reverse
> the reordering in the virtio_blk and nvme-pci ->queue_rqs implementations,
> similar to what the non-queue_rqs path does, and then adds a rq_list
> type that allows for efficient tail insertions and uses that to fix
> the reordering for real, and then does the same for I/O completions as
> well.
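
To make the problem described above concrete, here is a minimal
userspace sketch (illustrative names only, not the kernel rq_list API
that the series adds): pushing onto the head of a singly linked list is
O(1) but yields LIFO order, while also tracking a tail pointer keeps
appends O(1) and preserves submission order.

#include <stdio.h>
#include <stddef.h>

struct req {
	int id;
	struct req *next;
};

/* Plug-style head insertion: O(1), but iteration order becomes LIFO. */
static void push_head(struct req **head, struct req *rq)
{
	rq->next = *head;
	*head = rq;
}

/* Head + tail list: appends stay O(1) and iteration order stays FIFO. */
struct req_list {
	struct req *head;
	struct req *tail;
};

static void append_tail(struct req_list *list, struct req *rq)
{
	rq->next = NULL;
	if (list->tail)
		list->tail->next = rq;
	else
		list->head = rq;
	list->tail = rq;
}

int main(void)
{
	struct req a[3] = { { .id = 0 }, { .id = 1 }, { .id = 2 } };
	struct req b[3] = { { .id = 0 }, { .id = 1 }, { .id = 2 } };
	struct req *head = NULL;
	struct req_list list = { NULL, NULL };

	for (int i = 0; i < 3; i++) {
		push_head(&head, &a[i]);
		append_tail(&list, &b[i]);
	}

	printf("head insertion:");
	for (struct req *rq = head; rq; rq = rq->next)
		printf(" %d", rq->id);		/* prints 2 1 0 */

	printf("\ntail append:   ");
	for (struct req *rq = list.head; rq; rq = rq->next)
		printf(" %d", rq->id);		/* prints 0 1 2 */

	printf("\n");
	return 0;
}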

Hi Christoph,

Could something like the patch below replace this patch series? I don't
have a strong opinion about which approach to select. I'm sharing this
patch because it is what I came up with while looking into how to
support QD>1 for zoned devices with a storage controller that preserves
the request order (UFS).

Thanks,

Bart.


block: Make the plugging mechanism preserve the request order

Requests are added to the front of the plug list, and dispatching walks
the list from the front. Hence, requests are dispatched in the reverse
of the order in which they were submitted. Preserve the submission order
by reversing the plug list before dispatching it. This patch is a
modified version of a patch from Jens
(https://lore.kernel.org/linux-block/1872ae0a-6ba6-45f5-9f3d-8451ce06eb14@xxxxxxxxx/).

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3533bd808072..bf2ea421b2e8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2868,6 +2868,22 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
 	percpu_ref_put(&this_hctx->queue->q_usage_counter);
 }

+/* See also llist_reverse_order(). */
+static void blk_plug_reverse_order(struct blk_plug *plug)
+{
+	struct request *rq = plug->mq_list, *new_head = NULL;
+
+	while (rq) {
+		struct request *tmp = rq;
+
+		rq = rq->rq_next;
+		tmp->rq_next = new_head;
+		new_head = tmp;
+	}
+
+	plug->mq_list = new_head;
+}
+
 void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
 	struct request *rq;
@@ -2885,6 +2901,8 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	depth = plug->rq_count;
 	plug->rq_count = 0;

+	blk_plug_reverse_order(plug);
+
 	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
 		struct request_queue *q;





