The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features. Prior to this
patch series Intel has advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows monitoring features to be used, with the caveat
that users must be aware that Linux may migrate tasks more frequently
between SNC nodes than between "regular" NUMA nodes, so reading counters
from all SNC nodes may be needed to get a complete picture of activity
for tasks.

Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.

Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>

---

Dropped Peter's "Reviewed-by" from all but parts 5 & 8 since there have
been many changes since he provided those.

Other changes since v9 (all from Reinette's comments)

global  s/cpu/CPU/ in commit messages and code comments

#1      New test for invalid domain id before calling rdt_find_domain()
        means that error handling in that function and at all call-sites
        can be simplified.

        In pseudo_lock_region_init() use the new enum resctrl_scope for
        local variable.

#2      Include *all* common fields in the rdt_domain_hdr.
        Defer adding "type" until it is used later in part #3.

#3      Fix commit message to be specific that only the RDT_RESOURCE_L3
        resource is going to have different monitor and control scope.
        Rename get_domain_from_cpu() -> get_ctrl_domain_from_cpu()
        Rewrite comment for rdt_find_domain().
        Add "type" field to rdt_domain_hdr structure.
        Delete the /* RDT_RESOURCE_MBA is never mon_capable */ comment.

#4      Comment against patch 4, but now fixed in patch #2. cpu_mask is
        included in common header.

#5      No comments. No changes.

#6      Fixed missing word s/monitoring on Intel/monitoring on an Intel/
        Deleted "A later patch" paragraph.
        Expanded description of how values are "adjusted" for mon_scale
        and cache size.
        Changed type of "snc_nodes_per_l3_cache" to "unsigned int".

#7      Expand h/w to hardware (commit and code comments)
        Remove "earlier commit" reference
        s/counnter/counter/
        Check for offline CPUs and warn user SNC detection may be broken.

#8      No comments. No changes.

Tony Luck (8):
  x86/resctrl: Prepare for new domain scope
  x86/resctrl: Prepare to split rdt_domain structure
  x86/resctrl: Prepare for different scope for control/monitor
    operations
  x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
  x86/resctrl: Add node-scope to the options for feature scope
  x86/resctrl: Introduce snc_nodes_per_l3_cache
  x86/resctrl: Sub NUMA Cluster detection and enable
  x86/resctrl: Update documentation with Sub-NUMA cluster changes

 Documentation/arch/x86/resctrl.rst        |  23 +-
 include/linux/resctrl.h                   |  87 +++--
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  66 ++--
 arch/x86/kernel/cpu/resctrl/core.c        | 411 +++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  58 +--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  68 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  26 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 149 ++++----
 9 files changed, 607 insertions(+), 282 deletions(-)


base-commit: 5a6a09e97199d6600d31383055f9d43fbbcbe86f
-- 
2.41.0
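
Illustration only, not part of the series: the caveat in the cover letter
about reading counters from all SNC nodes can be sketched in user-space C
as below. The helper name snc_total_count() and the idea that the caller
has already read one value per SNC node are assumptions made for this
example. How those values are obtained from resctrl is deliberately left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the kernel or in this series): given the
 * values read for one monitoring event from each SNC node that shares
 * an L3 cache, return the total for the task or group being monitored.
 *
 * per_node:  one counter value per SNC node, as read by the caller
 * nr_nodes:  number of SNC nodes sharing the L3 cache
 */
uint64_t snc_total_count(const uint64_t *per_node, size_t nr_nodes)
{
	uint64_t total = 0;
	size_t i;

	/* A NULL array is only acceptable when there is nothing to add. */
	assert(per_node != NULL || nr_nodes == 0);

	/*
	 * A task that migrated between SNC nodes leaves part of its
	 * activity in each node's counter, so every node is included.
	 */
	for (i = 0; i < nr_nodes; i++)
		total += per_node[i];

	return total;
}
```

For example, if a task shows 1000 in the counter for one SNC node and 250
in the other, the helper returns 1250, which is the figure to compare
against what a non-SNC system would have reported for the whole L3 cache.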