On Wed, Mar 20 2024 at 08:46, Guenter Roeck wrote:
> On 3/20/24 01:58, Thomas Gleixner wrote:
>> On Fri, Mar 15 2024 at 09:17, Guenter Roeck wrote:
>>> I don't know the code well enough to determine what is wrong.
>>> Please let me know what I can do to help debugging the problem.
>>
>> Could you provide me the config and the qemu command line?
>>
>
> defconfig-CONFIG_SMP and
>
> qemu-system-x86_64 -kernel arch/x86/boot/bzImage -cpu Haswell \
>     --append "console=ttyS0" -nographic -monitor none
>
> The cpu doesn't really matter as long as it is an Intel CPU.
> A root file system isn't needed since the boot doesn't get that far.

Now it gets interesting because I can't reproduce it with that setup at
all.

What's weird is that I saw it exactly once on 64-bit in a VM with a UP
config two days ago, but when I started to add instrumentation it never
came back, even after backing the instrumentation changes out. I have
seriously no idea what's going on there.

Is it fully reproducible on your side? If so, can you please provide a
full dmesg and then apply the patch below and provide the resulting full
dmesg too?

I found two other issues while trying to find a way to reproduce, but
those are completely unrelated to the problem you are observing.

Thanks,

        tglx
---
 arch/x86/kernel/cpu/topology.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -176,6 +176,8 @@ static __init void topo_register_apic(u3
 {
 	int cpu, dom;
 
+	pr_info("APIC: %x %d\n", apic_id, present);
+
 	if (present) {
 		set_bit(apic_id, phys_cpu_present_map);
 
@@ -277,10 +279,23 @@ int topology_get_logical_id(u32 apicid,
 	/* Remove the bits below @at_level to get the proper level ID of @apicid */
 	unsigned int lvlid = topo_apicid(apicid, at_level);
 
-	if (lvlid >= MAX_LOCAL_APIC)
+	pr_info("APIC logical ID: %x %x %d\n", apicid, lvlid, at_level);
+
+	if (WARN_ON_ONCE(lvlid >= MAX_LOCAL_APIC))
 		return -ERANGE;
-	if (!test_bit(lvlid, apic_maps[at_level].map))
+
+	/*
+	 * If there was no APIC registered, then the map check below would
+	 * fail. With no APIC this is guaranteed to be an UP system and
+	 * therefore all topology levels have only one entry and their
+	 * logical ID is obviously 0.
+	 */
+	if (topo_info.boot_cpu_apic_id == BAD_APICID)
+		return 0;
+
+	if (WARN_ON_ONCE(!test_bit(lvlid, apic_maps[at_level].map)))
 		return -ENODEV;
+
 	/* Get the number of set bits before @lvlid. */
 	return bitmap_weight(apic_maps[at_level].map, lvlid);
 }