Re: [PATCH] net/mlx5: Use cpumask_local_spread() instead of custom code

Tariq Toukan <ttoukan.linux@xxxxxxxxx> · Wed, 14 Aug 2024 10:48:40 +0300

On 12/08/2024 11:22, Erwan Velu wrote:
Commit 2acda57736de ("net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity hints")
removed the usage of cpumask_local_spread().

The issue explained in this commit was fixed by
commit 406d394abfcd ("cpumask: improve on cpumask_local_spread() locality").

Since that commit, mlx5_cpumask_default_spread() has the same
behavior as cpumask_local_spread().


Adding Yuri.

One patch led to the other, and in the end they were all submitted within
the same patchset.

cpumask_local_spread() has indeed improved and, AFAIU, is functionally
equivalent to the existing logic.
According to [1], the current code is faster.
However, that alone is not a strong enough argument, as we're talking
about a slow path here.

Yuri, is that accurate? Is this the only difference?

If so, I am fine with this change, preferring simplicity.

[1] https://elixir.bootlin.com/linux/v6.11-rc3/source/lib/cpumask.c#L122

This commit:
- removes the custom logic and uses cpumask_local_spread() instead
- passes mlx5_core_dev as an argument for more flexibility

mlx5_cpumask_default_spread() is kept as it could be useful for some
future specific quirks.

Signed-off-by: Erwan Velu <e.velu@xxxxxxxxxx>
---
  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 27 +++-----------------
  1 file changed, 4 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index cb7e7e4104af..f15ecaef1331 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -835,28 +835,9 @@ static void comp_irq_release_pci(struct mlx5_core_dev *dev, u16 vecidx)
  	mlx5_irq_release_vector(irq);
  }
  
-static int mlx5_cpumask_default_spread(int numa_node, int index)
+static int mlx5_cpumask_default_spread(struct mlx5_core_dev *dev, int index)
  {
-	const struct cpumask *prev = cpu_none_mask;
-	const struct cpumask *mask;
-	int found_cpu = 0;
-	int i = 0;
-	int cpu;
-
-	rcu_read_lock();
-	for_each_numa_hop_mask(mask, numa_node) {
-		for_each_cpu_andnot(cpu, mask, prev) {
-			if (i++ == index) {
-				found_cpu = cpu;
-				goto spread_done;
-			}
-		}
-		prev = mask;
-	}
-
-spread_done:
-	rcu_read_unlock();
-	return found_cpu;
+	return cpumask_local_spread(index, dev->priv.numa_node);
  }
  
  static struct cpu_rmap *mlx5_eq_table_get_pci_rmap(struct mlx5_core_dev *dev)
@@ -880,7 +861,7 @@ static int comp_irq_request_pci(struct mlx5_core_dev *dev, u16 vecidx)
  	int cpu;
  
  	rmap = mlx5_eq_table_get_pci_rmap(dev);
-	cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vecidx);
+	cpu = mlx5_cpumask_default_spread(dev, vecidx);
  	irq = mlx5_irq_request_vector(dev, cpu, vecidx, &rmap);
  	if (IS_ERR(irq))
  		return PTR_ERR(irq);
@@ -1145,7 +1126,7 @@ int mlx5_comp_vector_get_cpu(struct mlx5_core_dev *dev, int vector)
  	if (mask)
  		cpu = cpumask_first(mask);
  	else
-		cpu = mlx5_cpumask_default_spread(dev->priv.numa_node, vector);
+		cpu = mlx5_cpumask_default_spread(dev, vector);
  
  	return cpu;
  }
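
For anyone comparing the two helpers, here is a minimal userspace sketch of
the walk that the removed loop performs. It is only a model under stated
assumptions: model_default_spread() is a hypothetical name, the CPU masks are
plain 64-bit bitmaps rather than struct cpumask, and hops[] stands in for the
nested per-distance masks that for_each_numa_hop_mask() yields (each entry is
assumed to be a superset of the previous one). No RCU or kernel API is
involved.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the removed loop.
 * hops[h] is a bitmap of all CPUs within h NUMA hops of the device's node,
 * so hops[h] is assumed to include every CPU already present in hops[h - 1].
 */
static int model_default_spread(const uint64_t *hops, int nhops, int index)
{
	uint64_t prev = 0;	/* plays the role of cpu_none_mask */
	int i = 0;

	for (int h = 0; h < nhops; h++) {
		/* Only CPUs that are new at this distance (mask andnot prev). */
		uint64_t fresh = hops[h] & ~prev;

		for (int cpu = 0; cpu < 64; cpu++) {
			if (!(fresh & (UINT64_C(1) << cpu)))
				continue;
			if (i++ == index)
				return cpu;
		}
		prev = hops[h];
	}

	/* Like the removed code, fall back to CPU 0 if index is out of range. */
	return 0;
}
```

With hops = { 0xF0, 0xFF, 0xFFFF } (local node holds CPUs 4-7, the next hop
adds CPUs 0-3, the last adds CPUs 8-15), index 0 maps to CPU 4, index 4 maps
to CPU 0, and index 8 maps to CPU 8, i.e. the nearest CPUs are handed out
first.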