Hi Tony, On 10/31/2023 2:17 PM, Tony Luck wrote: > There isn't a simple hardware bit that indicates whether a CPU is > running in Sub NUMA Cluster (SNC) mode. Infer the state by comparing > the ratio of NUMA nodes to L3 cache instances. > > When SNC mode is detected, reconfigure the RMID counters by updating > the MSR_RMID_SNC_CONFIG MSR on each socket as CPUs are seen. > > Clearing bit zero of the MSR divides the RMIDs and renumbers the ones > on the second SNC node to start from zero. > > Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx> > --- > Changes since v9 > Expand h/w to hardware (commit and code comments) > Remove "earlier commit" reference > s/counnter/counter/ > Check for offline CPUs and warn user SNC detection may be broken. > > arch/x86/include/asm/msr-index.h | 1 + > arch/x86/kernel/cpu/resctrl/core.c | 100 ++++++++++++++++++++++++++++- > 2 files changed, 99 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h > index e3fa9cecd599..4285a5ee81fe 100644 > --- a/arch/x86/include/asm/msr-index.h > +++ b/arch/x86/include/asm/msr-index.h > @@ -1109,6 +1109,7 @@ > #define MSR_IA32_QM_CTR 0xc8e > #define MSR_IA32_PQR_ASSOC 0xc8f > #define MSR_IA32_L3_CBM_BASE 0xc90 > +#define MSR_RMID_SNC_CONFIG 0xca0 > #define MSR_IA32_L2_CBM_BASE 0xd10 > #define MSR_IA32_MBA_THRTL_BASE 0xd50 > > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c > index 97d2a5a7dd41..034f9797e1fb 100644 > --- a/arch/x86/kernel/cpu/resctrl/core.c > +++ b/arch/x86/kernel/cpu/resctrl/core.c > @@ -16,11 +16,14 @@ > > #define pr_fmt(fmt) "resctrl: " fmt > > +#include <linux/cpu.h> > #include <linux/slab.h> > #include <linux/err.h> > #include <linux/cacheinfo.h> > #include <linux/cpuhotplug.h> > +#include <linux/mod_devicetable.h> > > +#include <asm/cpu_device_id.h> > #include <asm/intel-family.h> > #include <asm/resctrl.h> > #include "internal.h" > @@ -184,10 +187,10 @@ bool is_mba_sc(struct rdt_resource *r) > > /* > * rdt_get_mb_table() - get a mapping of bandwidth(b/w) percentage values > - * exposed to user interface and the h/w understandable delay values. > + * exposed to user interface and the hardware understandable delay values. > * > * The non-linear delay values have the granularity of power of two > - * and also the h/w does not guarantee a curve for configured delay > + * and also the hardware does not guarantee a curve for configured delay > * values vs. actual b/w enforced. > * Hence we need a mapping that is pre calibrated so the user can > * express the memory b/w as a percentage value. This seems out of place here. If you want to make such a global change it can be done as a separate patch. For this work it can just consistently use "hardware" in own areas changed. Reinette