Hi Tony,

On 8/25/2023 10:49 AM, Tony Luck wrote:
> On Fri, Aug 11, 2023 at 10:32:29AM -0700, Reinette Chatre wrote:
>> On 7/22/2023 12:07 PM, Tony Luck wrote:

...

>>> +static const struct x86_cpu_id snc_cpu_ids[] __initconst = {
>>> +	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0),
>>> +	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 0),
>>> +	X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, 0),
>>> +	{}
>>> +};
>>> +
>>> +/*
>>> + * There isn't a simple enumeration bit to show whether SNC mode
>>> + * is enabled. Look at the ratio of number of NUMA nodes to the
>>> + * number of distinct L3 caches. Take care to skip memory-only nodes.
>>> + */
>>> +static __init int get_snc_config(void)
>>> +{
>>> +	unsigned long *node_caches;
>>> +	int mem_only_nodes = 0;
>>> +	int cpu, node, ret;
>>> +
>>> +	if (!x86_match_cpu(snc_cpu_ids))
>>> +		return 1;
>>> +
>>> +	node_caches = kcalloc(BITS_TO_LONGS(nr_node_ids), sizeof(*node_caches), GFP_KERNEL);
>>> +	if (!node_caches)
>>> +		return 1;
>>> +
>>> +	cpus_read_lock();
>>> +	for_each_node(node) {
>>> +		cpu = cpumask_first(cpumask_of_node(node));
>>> +		if (cpu < nr_cpu_ids)
>>> +			set_bit(get_cpu_cacheinfo_id(cpu, 3), node_caches);
>>> +		else
>>> +			mem_only_nodes++;
>>> +	}
>>> +	cpus_read_unlock();
>>
>> I am not familiar with the numa code at all so please correct me
>> where I am wrong. I do see that nr_node_ids is initialized with __init code
>> so it should be accurate at this point. It looks to me like this initialization
>> assumes that at least one CPU per node will be online at the time it is run.
>> It is not clear to me that this assumption would always be true.
>
> Resctrl initialization is kicked off as a late_initcall(). So all CPUs
> and devices are fully initialized before this code runs.
>
> Resctrl can't be moved to an "init" state before CPUs are brought online
> because it makes a call to cpuhp_setup_state() to get callbacks for
> online/offline CPU events ... that call can't be done early.

Apologies but this is not so obvious to me. From what I understand a system
need not be booted with all CPUs online. CPUs can be brought online at any time.

>>> +
>>> +	ret = (nr_node_ids - mem_only_nodes) / bitmap_weight(node_caches, nr_node_ids);
>>> +	kfree(node_caches);
>>> +
>>> +	if (ret > 1)
>>> +		rdt_resources_all[RDT_RESOURCE_L3].r_resctrl.mon_scope = MON_SCOPE_NODE;
>>> +
>>> +	return ret;
>>> +}
>>> +

Reinette