Hi Tony, On 11/30/2023 2:43 PM, Tony Luck wrote: > On Thu, Nov 30, 2023 at 01:47:10PM -0800, Reinette Chatre wrote: ... >>> if (!x86_match_cpu(snc_cpu_ids)) >>> return 1; >> >> I understand and welcome this change as motivated by robustness. Apart >> from that, with this being a model specific feature for this particular >> group of systems, it it not clear to me in which scenarios this could >> run on a system where a present CPU does not have access to L3 cache. > > Agreed that on these systems there should always be an L3 cache. Should > I drop the check for "-1"? Please do keep it. I welcome the additional robustness. The static checker I tried did not complain about this but I expect that it is something that could trigger checks. > >>> >>> - node_caches = bitmap_zalloc(nr_node_ids, GFP_KERNEL); >>> + node_caches = bitmap_zalloc(num_online_cpus(), GFP_KERNEL); >> >> Please do take care to take new bitmap size into account in all >> places. From what I can tell there is a later bitmap_weight() call that >> still uses nr_node_ids as size. > > Oops. I was also using num_online_cpus() before cpus_read_lock(), so > things could theoretically change before the bitmap_weight() call. > I switched to using num_present_cpus() in both places. Thanks for catching this. I am not sure if num_present_cpus() is the right choice. I found its comment to say "If HOTPLUG is enabled, then cpu_present_mask varies dynamically ...". num_possible_cpus() seems more appropriate when looking for something that does not change while not holding the hotplug lock. Reading its description more closely also makes me wonder if the later num_online_cpus() != num_present_cpus() should also maybe be num_online_cpus() != num_possible_cpus() ? It seems to more closely match the intention. >>> if (!node_caches) >>> return 1; >>> >>> @@ -1072,10 +1073,13 @@ static __init int snc_get_config(void) >>> >>> for_each_node(node) { >>> cpu = cpumask_first(cpumask_of_node(node)); >>> - if (cpu < nr_cpu_ids) >>> - set_bit(get_cpu_cacheinfo_id(cpu, 3), node_caches); >>> - else >>> + if (cpu < nr_cpu_ids) { >>> + cache_id = get_cpu_cacheinfo_id(cpu, 3); >>> + if (cache_id != -1) >>> + set_bit(cache_id, node_caches); >>> + } else { >>> mem_only_nodes++; >>> + } >>> } >>> cpus_read_unlock(); >>> >> >> Could this code be made even more robust by checking the computed >> snc_nodes_per_l3_cache against the limited actually possible values? >> Forcing it to 1 if something went wrong? > > Added a couple of extra sanity checks. See updated incremental patch > below. Thank you very much. The additional checks look good to me. Reinette