On Thu, 2021-05-27 at 12:31 +0200, Borislav Petkov wrote:
> Ok,
>
> it took me a while to find a box like yours to reproduce on. Anyway,
> here's what looks like the final fix, you could give it a run.
>
> Thx.
>
> ---
> From: Borislav Petkov <bp@xxxxxxx>
> Date: Thu, 27 May 2021 11:02:26 +0200
>
> There are machines out there with added value crap^WBIOS which provide an
> SMI handler for the local APIC thermal sensor interrupt. Out of reset,
> the BSP on those machines has something like 0x200 in that APIC register
> (timestamps left in because this whole issue is timing sensitive):
>
>   [ 0.033858] read lvtthmr: 0x330, val: 0x200
>
> which means:
>
>  - bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled
>  - bits [10:8] have 010b which means SMI delivery mode.
>
> Now, later during boot, when the kernel programs the local APIC, it
> soft-disables it temporarily through the spurious vector register:
>
>   setup_local_APIC:
>
>         ...
>
>         /*
>          * If this comes from kexec/kcrash the APIC might be enabled in
>          * SPIV. Soft disable it before doing further initialization.
>          */
>         value = apic_read(APIC_SPIV);
>         value &= ~APIC_SPIV_APIC_ENABLED;
>         apic_write(APIC_SPIV, value);
>
> which means (from the SDM):
>
> "10.4.7.2 Local APIC State After It Has Been Software Disabled
>
> ...
>
> * The mask bits for all the LVT entries are set. Attempts to reset these
> bits will be ignored."
>
> And this happens too:
>
>   [ 0.124111] APIC: Switch to symmetric I/O mode setup
>   [ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0
>   [ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0
>
> This results in CPU 0 soft lockups depending on the placement in time
> when the APIC soft-disable happens. Those soft lockups are not 100%
> reproducible and the reason for that can only be speculated as no one
> tells you what SMM does.
> Likely, it confuses the SMM code that the APIC
> is disabled and the thermal interrupt doesn't fire at all,

My guess is that the system sometimes boots hot. SMM started a fan or
some other cooling and set a temperature threshold. It is waiting for the
thermal interrupt for that threshold, which it never got.

Thanks,
Srinivas

> leading to CPU 0 stuck in SMM forever...
>
> Now, before
>
>   4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()")
>
> due to how the APIC_LVTTHMR was read before APIC initialization in
> mcheck_intel_therm_init(), it would read the value with the mask bit 16
> clear and then intel_init_thermal() would replicate it onto the APs and
> all would be peachy - the thermal interrupt would remain enabled.
>
> But that commit moved that reading to a later moment in
> intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too
> late and with its interrupt mask bit set.
>
> Thus, revert back to the old behavior of reading the thermal LVT
> register before the APIC gets initialized.
>
> Fixes: 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()")
> Reported-by: James Feeney <james@xxxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Cc: Zhang Rui <rui.zhang@xxxxxxxxx>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@xxxxxxx
> ---
>  arch/x86/include/asm/thermal.h      |  4 +++-
>  arch/x86/kernel/setup.c             |  9 +++++++++
>  drivers/thermal/intel/therm_throt.c | 15 +++++++++++----
>  3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/thermal.h b/arch/x86/include/asm/thermal.h
> index ddbdefd5b94f..91a7b6687c3b 100644
> --- a/arch/x86/include/asm/thermal.h
> +++ b/arch/x86/include/asm/thermal.h
> @@ -3,11 +3,13 @@
>  #define _ASM_X86_THERMAL_H
>
>  #ifdef CONFIG_X86_THERMAL_VECTOR
> +void therm_lvt_init(void);
>  void intel_init_thermal(struct cpuinfo_x86 *c);
>  bool x86_thermal_enabled(void);
>  void intel_thermal_interrupt(void);
>  #else
> -static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
> +static inline void therm_lvt_init(void) { }
> +static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
>  #endif
>
>  #endif /* _ASM_X86_THERMAL_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 72920af0b3c0..ff653d608d5f 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -44,6 +44,7 @@
>  #include <asm/pci-direct.h>
>  #include <asm/prom.h>
>  #include <asm/proto.h>
> +#include <asm/thermal.h>
>  #include <asm/unwind.h>
>  #include <asm/vsyscall.h>
>  #include <linux/vmalloc.h>
> @@ -1226,6 +1227,14 @@ void __init setup_arch(char **cmdline_p)
>
>          x86_init.timers.wallclock_init();
>
> +        /*
> +         * This needs to run before setup_local_APIC() which soft-disables the
> +         * local APIC temporarily and that masks the thermal LVT interrupt,
> +         * leading to softlockups on machines which have configured SMI
> +         * interrupt delivery.
> +         */
> +        therm_lvt_init();
> +
>          mcheck_init();
>
>          register_refined_jiffies(CLOCK_TICK_RATE);
> diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
> index f8e882592ba5..99abdc03c44c 100644
> --- a/drivers/thermal/intel/therm_throt.c
> +++ b/drivers/thermal/intel/therm_throt.c
> @@ -621,6 +621,17 @@ bool x86_thermal_enabled(void)
>          return atomic_read(&therm_throt_en);
>  }
>
> +void __init therm_lvt_init(void)
> +{
> +        /*
> +         * This function is only called on boot CPU. Save the init thermal
> +         * LVT value on BSP and use that value to restore APs' thermal LVT
> +         * entry BIOS programmed later
> +         */
> +        if (intel_thermal_supported(&boot_cpu_data))
> +                lvtthmr_init = apic_read(APIC_LVTTHMR);
> +}
> +
>  void intel_init_thermal(struct cpuinfo_x86 *c)
>  {
>          unsigned int cpu = smp_processor_id();
> @@ -630,10 +641,6 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
>          if (!intel_thermal_supported(c))
>                  return;
>
> -        /* On the BSP? */
> -        if (c == &boot_cpu_data)
> -                lvtthmr_init = apic_read(APIC_LVTTHMR);
> -
>          /*
>           * First check if its enabled already, in which case there might
>           * be some SMM goo which handles it, so we can't even put a handler
> --
> 2.29.2
>
>