On 9/22/21 3:25 PM, Davidlohr Bueso wrote:
On Fri, 14 May 2021, Alex Kogan wrote:
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a816935d23d4..94d35507560c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3515,6 +3515,16 @@
NUMA balancing.
Allowed values are enable and disable
+ numa_spinlock= [NUMA, PV_OPS] Select the NUMA-aware variant
+ of spinlock. The options are:
+ auto - Enable this variant if running on a multi-node
+ machine in native environment.
+ on - Unconditionally enable this variant.
Is there any reason why the user would explicitly pass the on option
when the auto thing already does the multi-node check? Perhaps strange
numa topologies? Otherwise I would say it's not needed and the fewer
options we give the user for low level locking the better.
I asked Alex to put in a command line option because we may want to
disable it on a multi-socket server.
+ off - Unconditionally disable this variant.
+
+ Not specifying this option is equivalent to
+ numa_spinlock=auto.
+
numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
'node', 'default' can be specified
This can be set from sysctl after boot.
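To make the auto/on/off semantics of numa_spinlock= concrete, here is a minimal userspace sketch of how the three values could map to an enable decision. It is not the code from the patch: the enum, both function names, and the nr_nodes/native inputs are hypothetical and only mirror the wording of the documentation hunk above.

```c
#include <string.h>

/* Hypothetical tri-state mirroring the documented option values. */
enum numa_spinlock_mode {
	NUMA_SPINLOCK_AUTO,
	NUMA_SPINLOCK_ON,
	NUMA_SPINLOCK_OFF,
};

/*
 * Parse the option value. A NULL argument stands for "option not
 * specified", which the documentation says is equivalent to auto.
 * Returns 0 on success, -1 on an unrecognized value.
 */
static int parse_numa_spinlock(const char *arg, enum numa_spinlock_mode *mode)
{
	if (!arg || !strcmp(arg, "auto"))
		*mode = NUMA_SPINLOCK_AUTO;
	else if (!strcmp(arg, "on"))
		*mode = NUMA_SPINLOCK_ON;
	else if (!strcmp(arg, "off"))
		*mode = NUMA_SPINLOCK_OFF;
	else
		return -1;
	return 0;
}

/*
 * Decide whether the NUMA-aware variant is used. nr_nodes and native
 * are assumed inputs: the number of NUMA nodes and whether we run in
 * a native (non-virtualized) environment.
 */
static int numa_spinlock_enabled(enum numa_spinlock_mode mode,
				 int nr_nodes, int native)
{
	switch (mode) {
	case NUMA_SPINLOCK_ON:
		return 1;
	case NUMA_SPINLOCK_OFF:
		return 0;
	default:
		/* auto: multi-node machine in native environment only */
		return nr_nodes > 1 && native;
	}
}
```

With this shape, "off" is the only way to disable the variant on a multi-socket native machine, which is the case the command line option was requested for.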
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0045e1b44190..819c3dad8afc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1564,6 +1564,26 @@ config NUMA
Otherwise, you should say N.
+config NUMA_AWARE_SPINLOCKS
+ bool "Numa-aware spinlocks"
+ depends on NUMA
+ depends on QUEUED_SPINLOCKS
+ depends on 64BIT
+ # For now, we depend on PARAVIRT_SPINLOCKS to make the patching work.
+ # This is awkward, but hopefully would be resolved once static_call()
+ # is available.
+ depends on PARAVIRT_SPINLOCKS
We now have static_call() - see 9183c3f9ed7.
I agree that it is now time to look at using the static call for
slowpath switching.
+ default y
+ help
+ Introduce NUMA (Non Uniform Memory Access) awareness into
+ the slow path of spinlocks.
+
+ In this variant of qspinlock, the kernel will try to keep the lock
+ on the same node, thus reducing the number of remote cache misses,
+ while trading some of the short term fairness for better performance.
+
+ Say N if you want absolute first come first serve fairness.
This would also need a depends on !PREEMPT_RT, no? Raw spinlocks really want
the determinism.
Agreed
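
On the static_call() point above, the following is only a userspace analogue of what slowpath switching means: a single call target chosen once at boot and then used by every contended lock acquisition. It uses a plain function pointer so that it compiles outside the kernel. In the kernel the same shape would presumably be expressed with DEFINE_STATIC_CALL(), static_call_update() and static_call(), which patch the call site instead of making an indirect call. The struct layout, the two slowpath stubs, and the last_path marker are hypothetical stand-ins, not the functions from the patch.

```c
/* Stand-in for the real lock word; hypothetical layout. */
struct qspinlock {
	int val;
};

/* Records which slowpath ran last, for illustration only. */
static int last_path;

/* Hypothetical stub for the default slowpath. */
static void native_slowpath(struct qspinlock *lock)
{
	lock->val = 1;
	last_path = 1;
}

/* Hypothetical stub for the NUMA-aware slowpath. */
static void numa_aware_slowpath(struct qspinlock *lock)
{
	lock->val = 1;
	last_path = 2;
}

/* The switchable call target; defaults to the native slowpath. */
static void (*slowpath_fn)(struct qspinlock *) = native_slowpath;

/* Called once at boot in this sketch, after the option is parsed. */
static void select_slowpath(int use_numa_aware)
{
	slowpath_fn = use_numa_aware ? numa_aware_slowpath : native_slowpath;
}

/* What a contended lock acquisition would call. */
static void lock_slowpath(struct qspinlock *lock)
{
	slowpath_fn(lock);
}
```

The point of moving to static_call() would be to drop the PARAVIRT_SPINLOCKS dependency that exists only to get the patching machinery.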
Cheers,
Longman