Re: [PATCH mlx5-next] RDMA/mlx5: Don't use cached IRQ affinity mask

[ 2032.194376] nvme nvme0: failed to connect queue: 9 ret=-18

queue 9 is not mapped (overlap).
please try the below:


This seems to work.  Here are three mapping cases:  each vector on its
own CPU, each vector on one CPU within the local NUMA node, and each
vector having all CPUs in its NUMA node.  The 2nd mapping looks kinda
funny, but I think it achieved what you wanted?  And all three cases
resulted in successful connections.


Thanks for testing this.
I slightly improved the setting of the leftover CPUs and actually used Sagi's initial proposal.
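To make the intended mapping order concrete, below is a rough user-space sketch of the three passes (toy affinity masks and a plain round-robin fallback for the leftover CPUs; only an illustration, not the kernel API or the exact patch code):

/*
 * Sketch of the mapping order the attached patch aims for:
 * pass 1 gives each queue one CPU from its affinity hint,
 * pass 2 places still-unmapped queues on any free CPU,
 * pass 3 spreads the leftover CPUs (here: simple round-robin).
 */
#include <stdio.h>

#define NR_CPUS    8
#define NR_QUEUES  4
#define UNMAPPED  -1

int main(void)
{
	/* toy per-queue affinity hints: queues 0/1 share CPUs 0-3, 2/3 share 4-7 */
	int affinity[NR_QUEUES][NR_CPUS] = {
		{ 1, 1, 1, 1, 0, 0, 0, 0 },
		{ 1, 1, 1, 1, 0, 0, 0, 0 },
		{ 0, 0, 0, 0, 1, 1, 1, 1 },
		{ 0, 0, 0, 0, 1, 1, 1, 1 },
	};
	int mq_map[NR_CPUS];
	int queue, cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		mq_map[cpu] = UNMAPPED;

	/* pass 1: each queue grabs the first free CPU in its affinity mask */
	for (queue = 0; queue < NR_QUEUES; queue++) {
		for (cpu = 0; cpu < NR_CPUS; cpu++) {
			if (affinity[queue][cpu] && mq_map[cpu] == UNMAPPED) {
				mq_map[cpu] = queue;
				break;
			}
		}
	}

	/* pass 2: a queue that still owns no CPU takes any unmapped CPU */
	for (queue = 0; queue < NR_QUEUES; queue++) {
		int mapped = 0;

		for (cpu = 0; cpu < NR_CPUS; cpu++)
			if (mq_map[cpu] == queue)
				mapped = 1;
		for (cpu = 0; !mapped && cpu < NR_CPUS; cpu++) {
			if (mq_map[cpu] == UNMAPPED) {
				mq_map[cpu] = queue;
				mapped = 1;
			}
		}
	}

	/* pass 3: leftover CPUs fall back to a default round-robin spread */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mq_map[cpu] == UNMAPPED)
			mq_map[cpu] = cpu % NR_QUEUES;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		printf("cpu %d -> queue %d\n", cpu, mq_map[cpu]);
	return 0;
}

With these toy masks, pass 1 gives each of the 4 queues one CPU from its hint, pass 2 has nothing to do, and pass 3 spreads the 4 leftover CPUs, so every CPU and every queue ends up mapped.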

Sagi,
please review the attached patch and let me know if I should add your signature to it. I'll run some perf tests on it early next week (meanwhile I've successfully run login/logout with different num_queues and IRQ settings).

Steve,
It would be great if you could apply the attached patch on your system and send your findings.
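For checking the result on your setup, something along these lines should print which CPUs each hctx ended up with (it assumes debugfs is mounted at /sys/kernel/debug and the namespace is nvme0n1; adjust BASE for your setup and run it as root):

/*
 * Dump the resulting hctx <-> CPU mapping from blk-mq debugfs by
 * listing the cpu* directories under each hctx* directory.
 */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

#define BASE "/sys/kernel/debug/block/nvme0n1"

int main(void)
{
	DIR *blk = opendir(BASE);
	struct dirent *h, *c;
	char path[512];

	if (!blk) {
		perror(BASE);
		return 1;
	}
	while ((h = readdir(blk))) {
		DIR *hctx;

		if (strncmp(h->d_name, "hctx", 4))
			continue;
		snprintf(path, sizeof(path), BASE "/%s", h->d_name);
		hctx = opendir(path);
		if (!hctx)
			continue;
		printf("%s:", h->d_name);
		while ((c = readdir(hctx)))
			if (!strncmp(c->d_name, "cpu", 3))
				printf(" %s", c->d_name);
		printf("\n");
		closedir(hctx);
	}
	closedir(blk);
	return 0;
}

Each possible CPU should show up under exactly one hctx, so this gives a quick view of whether all CPUs and all queues are covered.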

Regards,
Max
From 6f7b98f1c43252f459772390c178fc3ad043fc82 Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <maxg@xxxxxxxxxxxx>
Date: Thu, 19 Jul 2018 12:42:00 +0000
Subject: [PATCH 1/1] blk-mq: fix RDMA queue/CPU mapping assignments for mq

In order to fulfil the block layer cpu <-> queue mapping, all the
allocated queues and all the possible CPUs should be mapped. First,
try to map the queues according to the affinity hints from the
underlying RDMA device. Second, map each still-unmapped queue in a
naive way to an unmapped CPU. In case we still have unmapped CPUs,
use the default blk-mq mappings to map the rest. This way we
guarantee that, no matter what the underlying affinity is, all the
possible CPUs and all the allocated block queues will be mapped.

Signed-off-by: Max Gurtovoy <maxg@xxxxxxxxxxxx>
---
 block/blk-mq-cpumap.c  | 41 ++++++++++++++++++++++++-----------------
 block/blk-mq-rdma.c    | 44 ++++++++++++++++++++++++++++++++++++++++++--
 include/linux/blk-mq.h |  1 +
 3 files changed, 67 insertions(+), 19 deletions(-)

diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 3eb169f..02b888f 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -30,29 +30,36 @@ static int get_first_sibling(unsigned int cpu)
 	return cpu;
 }
 
-int blk_mq_map_queues(struct blk_mq_tag_set *set)
+void blk_mq_map_queue_to_cpu(struct blk_mq_tag_set *set, unsigned int cpu)
 {
 	unsigned int *map = set->mq_map;
 	unsigned int nr_queues = set->nr_hw_queues;
-	unsigned int cpu, first_sibling;
+	unsigned int first_sibling;
 
-	for_each_possible_cpu(cpu) {
-		/*
-		 * First do sequential mapping between CPUs and queues.
-		 * In case we still have CPUs to map, and we have some number of
-		 * threads per cores then map sibling threads to the same queue for
-		 * performace optimizations.
-		 */
-		if (cpu < nr_queues) {
+	/*
+	 * First do sequential mapping between CPUs and queues.
+	 * In case we still have CPUs to map, and we have some number of
+	 * threads per cores then map sibling threads to the same queue for
+	 * performace optimizations.
+	 */
+	if (cpu < nr_queues) {
+		map[cpu] = cpu_to_queue_index(nr_queues, cpu);
+	} else {
+		first_sibling = get_first_sibling(cpu);
+		if (first_sibling == cpu)
 			map[cpu] = cpu_to_queue_index(nr_queues, cpu);
-		} else {
-			first_sibling = get_first_sibling(cpu);
-			if (first_sibling == cpu)
-				map[cpu] = cpu_to_queue_index(nr_queues, cpu);
-			else
-				map[cpu] = map[first_sibling];
-		}
+		else
+			map[cpu] = map[first_sibling];
 	}
+}
+EXPORT_SYMBOL_GPL(blk_mq_map_queue_to_cpu);
+
+int blk_mq_map_queues(struct blk_mq_tag_set *set)
+{
+	unsigned int cpu;
+
+	for_each_possible_cpu(cpu)
+		blk_mq_map_queue_to_cpu(set, cpu);
 
 	return 0;
 }
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
index 996167f..10e4f8a 100644
--- a/block/blk-mq-rdma.c
+++ b/block/blk-mq-rdma.c
@@ -34,14 +34,54 @@ int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
 {
 	const struct cpumask *mask;
 	unsigned int queue, cpu;
+	bool mapped;
 
+	/* reset all CPUs mapping */
+	for_each_possible_cpu(cpu)
+		set->mq_map[cpu] = UINT_MAX;
+
+	/* Try to map the queues according to affinity */
 	for (queue = 0; queue < set->nr_hw_queues; queue++) {
 		mask = ib_get_vector_affinity(dev, first_vec + queue);
 		if (!mask)
 			goto fallback;
 
-		for_each_cpu(cpu, mask)
-			set->mq_map[cpu] = queue;
+		for_each_cpu(cpu, mask) {
+			if (set->mq_map[cpu] == UINT_MAX) {
+				set->mq_map[cpu] = queue;
+				/* Initially each queue is mapped to one CPU */
+				break;
+			}
+		}
+	}
+
+	/* Map the unmapped queues in a naive way */
+	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+		mapped = false;
+		for_each_possible_cpu(cpu) {
+			if (set->mq_map[cpu] == queue) {
+				mapped = true;
+				break;
+			}
+		}
+		if (!mapped) {
+			for_each_possible_cpu(cpu) {
+				if (set->mq_map[cpu] == UINT_MAX) {
+					set->mq_map[cpu] = queue;
+					mapped = true;
+					break;
+				}
+			}
+		}
+		/* This case should never happen */
+		if (WARN_ON_ONCE(!mapped))
+			goto fallback;
+	}
+
+	/* set all the rest of the CPUs */
+	for_each_possible_cpu(cpu) {
+		if (set->mq_map[cpu] == UINT_MAX)
+			blk_mq_map_queue_to_cpu(set, cpu);
 	}
 
 	return 0;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e3147eb..d6cd114 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -282,6 +282,7 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
 				     unsigned long timeout);
 
+void blk_mq_map_queue_to_cpu(struct blk_mq_tag_set *set, unsigned int cpu);
 int blk_mq_map_queues(struct blk_mq_tag_set *set);
 void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
 
-- 
1.8.3.1

