On Wed, Aug 14, 2024 at 10:48:40AM +0300, Tariq Toukan wrote:
>
>
> On 12/08/2024 11:22, Erwan Velu wrote:
> > Commit 2acda57736de ("net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity hints")
> > removed the usage of cpumask_local_spread().
> >
> > The issue explained in this commit was fixed by
> > commit 406d394abfcd ("cpumask: improve on cpumask_local_spread() locality").
> >
> > Since this commit, mlx5_cpumask_default_spread() is having the same
> > behavior as cpumask_local_spread().
> >
>
> Adding Yuri.
>
> One patch led to the other, finally they were all submitted within the same
> patchset.
>
> cpumask_local_spread() indeed improved, and AFAIU is functionally equivalent
> to existing logic.
> According to [1] the current code is faster.
> However, this alone is not a relevant enough argument, as we're talking
> about slowpath here.
>
> Yuri, is that accurate? Is this the only difference?
>
> If so, I am fine with this change, preferring simplicity.
>
> [1] https://elixir.bootlin.com/linux/v6.11-rc3/source/lib/cpumask.c#L122

If you end up calling mlx5_cpumask_default_spread() for each CPU, it
would be O(N^2). If you call cpumask_local_spread() for each CPU, your
complexity would be O(N*logN), because under the hood it uses binary
search.

The comment you've mentioned says that you can traverse your CPUs in
O(N) if you can manage to put all the logic inside the
for_each_numa_hop_mask() iterator. It doesn't seem to be your case.

I agree with you. mlx5_cpumask_default_spread() should be switched to
using library code.

Acked-by: Yury Norov <yury.norov@xxxxxxxxx>

You may be interested in siblings-aware CPU distribution I've made
for mana ethernet driver in 91bfe210e196. This is also an example
where using for_each_numa_hop_mask() over simple cpumask_local_spread()
is justified.

Thanks,
Yury
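
For illustration only (not part of the patch, and not kernel code): the
complexity argument above can be modelled in userspace C. The sketch
assumes the NUMA distances have been reduced to a hypothetical array
cum[], where cum[h] is the number of CPUs within h hops of the local
node. Both function names are made up for this example. The first
function walks CPUs one by one, roughly as a per-call iterator loop
would, so calling it for every CPU is O(N^2). The second binary-searches
the hop levels, which is the idea the O(N*logN) figure relies on.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only. cum[h] is the number of CPUs reachable within
 * h NUMA hops of the local node, so cum[] is non-decreasing and
 * cum[nhops - 1] is the total CPU count. All names are hypothetical.
 */

/* Walk CPUs in hop order until the i-th one is reached: O(i) per call. */
static size_t hop_of_nth_walk(const unsigned int *cum, size_t nhops,
			      unsigned int i)
{
	unsigned int seen = 0;
	size_t h;

	assert(nhops > 0 && i < cum[nhops - 1]);
	for (h = 0; h < nhops; h++) {
		/* Visit each CPU that first becomes reachable at hop h. */
		for (; seen < cum[h]; seen++)
			if (seen == i)
				return h;
	}
	return nhops - 1; /* not reached when the precondition holds */
}

/* Find the smallest h with cum[h] > i by binary search: O(log nhops). */
static size_t hop_of_nth_bsearch(const unsigned int *cum, size_t nhops,
				 unsigned int i)
{
	size_t lo = 0, hi;

	assert(nhops > 0 && i < cum[nhops - 1]);
	hi = nhops - 1;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (i < cum[mid])
			hi = mid;
		else
			lo = mid + 1;
	}
	return lo;
}
```

Both helpers return the same hop level for every index. Only the cost
per call differs, which matters when the lookup is repeated once per
CPU.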