Hi, Andres,

On Tue, 2023-12-05 at 22:58 -0800, Andres Freund wrote:
> Hi,
>
> On 2023-12-01 08:31:48 +0000, Zhang, Rui wrote:
> > As a quick fix, I'm not going to fix the "potential issue" describes
> > above because we have not seen a real problem caused by this yet.
> >
> > Can you please try the below patch to confirm if the problem is gone on
> > your system?
> > This patch falls back to the previous way as sent at
> > https://lore.kernel.org/lkml/87pm4bp54z.ffs@tglx/T/
> >
>
> I've just spent a couple hours bisecting why upgrading to 6.7-rc4 left me with
> just a single CPU core on my dual socket workstation.
>
>
> before:
> [ 0.000000] Linux version 6.6.0-andres-00003-g31255e072b2e ...
> ...
> [ 0.022960] ACPI: Using ACPI (MADT) for SMP configuration information
> ...
> [ 0.022968] smpboot: Allowing 40 CPUs, 0 hotplug CPUs
> ...
> [ 0.345921] smpboot: CPU0: Intel(R) Xeon(R) Gold 5215 CPU @ 2.50GHz (family: 0x6, model: 0x55, stepping: 0x7)
> ...
> [ 0.347229] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9
> [ 0.349082] .... node #1, CPUs: #10 #11 #12 #13 #14 #15 #16 #17 #18 #19
> [ 0.003190] smpboot: CPU 10 Converting physical 0 to logical die 1
> [ 0.361053] .... node #0, CPUs: #20 #21 #22 #23 #24 #25 #26 #27 #28 #29
> [ 0.363990] .... node #1, CPUs: #30 #31 #32 #33 #34 #35 #36 #37 #38 #39
> ...
> [ 0.370886] smp: Brought up 2 nodes, 40 CPUs
> [ 0.370891] smpboot: Max logical packages: 2
> [ 0.370896] smpboot: Total of 40 processors activated (200000.00 BogoMIPS)
> [ 0.403905] node 0 deferred pages initialised in 32ms
> [ 0.408865] node 1 deferred pages initialised in 37ms
>
>
> after:
> [ 0.000000] Linux version 6.6.0-andres-00004-gec9aedb2aa1a ...
> ...
> [ 0.022935] ACPI: Using ACPI (MADT) for SMP configuration information
> ...
> [ 0.022942] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
> ...
> [ 0.356424] smpboot: CPU0: Intel(R) Xeon(R) Gold 5215 CPU @ 2.50GHz (family: 0x6, model: 0x55, stepping: 0x7)
> ...
> [ 0.357098] smp: Bringing up secondary CPUs ...
> [ 0.357107] smp: Brought up 2 nodes, 1 CPU
> [ 0.357108] smpboot: Max logical packages: 1
> [ 0.357110] smpboot: Total of 1 processors activated (5000.00 BogoMIPS)
> [ 0.726283] node 0 deferred pages initialised in 368ms
> [ 0.774704] node 1 deferred pages initialised in 418ms
>
>
> There does seem to be something off with the ACPI data, when booting without
> the patch,

Which patch are you referring to? The original patch in this thread?

Does the second patch fix the problem? I mean the patch at
https://lore.kernel.org/all/904ce2b870b8a7f34114f93adc7c8170420869d1.camel@xxxxxxxxx/

thanks,
rui

> I do see messages like:
> [ 0.715228] APIC: NR_CPUS/possible_cpus limit of 40 reached. Processor 40/0x7f00 ignored.
> [ 0.715231] ACPI: Unable to map lapic to logical cpu number
>
> But other than that, the system has worked for a couple years.
>
>
> It's obviously not good to regress from 2x10/20 cores/threads to a single
> core. I guess it's at least somewhat funny to imagine a 2 socket system with
> a single core...
>
>
> It seems particularly worrying that this patch has apparently been selected
> for -stable:
> https://lore.kernel.org/all/20231122153212.852040-2-sashal@xxxxxxxxxx/
>
> Even if it didn't have these unintended consequences, it seems like a commit
> like this hardly is -stable material?
>
>
> I've attached .config, dmesg of a boot with gec9aedb2aa1a and one with
> gec9aedb2aa1a^.
>
> Greetings,
>
> Andres Freund