Hi Thomas,

On Wed, 2016-11-09 at 12:27 -0800, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  d49597fd3bc7d9534de55e9256767f073be1b33a
> Gitweb:     https://urldefense.proofpoint.com/v2/url?u=http-3A__git.kernel.org_tip_d49597fd3bc7d9534de55e9256767f073be1b33a&d=CwIDaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=2AkLWShm6V8Nuu8ZZ-80Flo6y0XxCGmO1xrsAeRArAE&m=WBsB4JFr-Dct0um4Kf8QAxC7w6p-Mlk3H-LwItQJ7Fw&s=qI64vSH3y6q8wJhcqpI4dXYma-i1RTtlxgKwKwhFWWo&e=
> Author:     Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> AuthorDate: Wed, 9 Nov 2016 16:35:51 +0100
> Committer:  Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CommitDate: Wed, 9 Nov 2016 21:05:01 +0100
>
> x86/cpu: Deal with broken firmware (VMWare/XEN)
>
> Both ACPI and MP specifications require that the APIC id in the respective
> tables must be the same as the APIC id in CPUID.
>
> The kernel retrieves the physical package id from the APIC id during the
> ACPI/MP table scan and builds the physical to logical package map. The
> physical package id which is used after a CPU comes up is retrieved from
> CPUID. So we rely on ACPI/MP tables and CPUID agreeing in that respect.
>
> There exist VMware and XEN implementations which violate the spec. As a
> result the physical to logical package map, which relies on the ACPI/MP
> tables does not work on those systems, because the CPUID initialized
> physical package id does not match the firmware id. This causes system
> crashes and malfunction due to invalid package mappings.

For documentation purposes, let me note that VMware VMs running at
virtual hardware version 9 and above do not have this divergence between
ACPI/MP and CPUID on the package id. So not everyone will see this issue
on their VMs; the bug is limited to VMs running at virtual hardware
version 8 and earlier.

It's good that we can work around the platform bug for those VMs. Thanks
for adding these checks.
Alok

>
> The only way to cure this is to sanitize the physical package id after the
> CPUID enumeration and yell when the APIC ids are different. Fix up the
> initial APIC id, which is fine as it is only used printout purposes.
>
> If the physical package IDs differ yell and use the package information
> from the ACPI/MP tables so the existing logical package map just works.
>
> Chas provided the resulting dmesg output for his affected 4 virtual
> sockets, 1 core per socket VM:
>
> [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 1 CPUID: 2
> [Firmware Bug]: CPU1: Using firmware package id 1 instead of 2
> ....
>
> Reported-and-tested-by: "Charles (Chas) Williams" <ciwillia@xxxxxxxxxxx>,
> Reported-by: M. Vefa Bicakci <m.v.b@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Alok Kataria <akataria@xxxxxxxxxx>
> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> Cc: #4.6+ <stable@vger,kernel.org>
> Link: https://urldefense.proofpoint.com/v2/url?u=http-3A__lkml.kernel.org_r_alpine.DEB.2.20.1611091613540.3501-40nanos&d=CwIDaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=2AkLWShm6V8Nuu8ZZ-80Flo6y0XxCGmO1xrsAeRArAE&m=WBsB4JFr-Dct0um4Kf8QAxC7w6p-Mlk3H-LwItQJ7Fw&s=HNQMGUrw_s6Mc_oyREBnD4TrUjERbLcH1viAZr-aFPY&e=
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
>  arch/x86/kernel/cpu/common.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 9bd910a..cc9e980 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -979,6 +979,35 @@ static void x86_init_cache_qos(struct cpuinfo_x86 *c)
>  }
>
>  /*
> + * The physical to logical package id mapping is initialized from the
> + * acpi/mptables information. Make sure that CPUID actually agrees with
> + * that.
> + */
> +static void sanitize_package_id(struct cpuinfo_x86 *c)
> +{
> +#ifdef CONFIG_SMP
> +	unsigned int pkg, apicid, cpu = smp_processor_id();
> +
> +	apicid = apic->cpu_present_to_apicid(cpu);
> +	pkg = apicid >> boot_cpu_data.x86_coreid_bits;
> +
> +	if (apicid != c->initial_apicid) {
> +		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x CPUID: %x\n",
> +		       cpu, apicid, c->initial_apicid);
> +		c->initial_apicid = apicid;
> +	}
> +	if (pkg != c->phys_proc_id) {
> +		pr_err(FW_BUG "CPU%u: Using firmware package id %u instead of %u\n",
> +		       cpu, pkg, c->phys_proc_id);
> +		c->phys_proc_id = pkg;
> +	}
> +	c->logical_proc_id = topology_phys_to_logical_pkg(pkg);
> +#else
> +	c->logical_proc_id = 0;
> +#endif
> +}
> +
> +/*
>   * This does the hard work of actually picking apart the CPU stuff...
>   */
>  static void identify_cpu(struct cpuinfo_x86 *c)
> @@ -1103,8 +1132,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
>  #ifdef CONFIG_NUMA
>  	numa_add_cpu(smp_processor_id());
>  #endif
> -	/* The boot/hotplug time assigment got cleared, restore it */
> -	c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id);
> +	sanitize_package_id(c);
>  }
>
>  /*
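
To make the check in sanitize_package_id() easier to follow outside the kernel, here is a minimal userspace sketch. It is not the kernel code: `struct cpu_ids`, `sanitize_ids()` and its parameters are hypothetical stand-ins for `struct cpuinfo_x86`, the firmware APIC id lookup and `boot_cpu_data.x86_coreid_bits`, and the logical package lookup is left out. It assumes, as in the patch, that the package id is the firmware APIC id shifted right by the number of core id bits.

```c
#include <stdio.h>

/* Hypothetical stand-in for the two cpuinfo_x86 fields the patch fixes up. */
struct cpu_ids {
	unsigned int initial_apicid;	/* APIC id as enumerated via CPUID */
	unsigned int phys_proc_id;	/* package id derived from CPUID */
};

/*
 * Compare the CPUID-derived ids against the firmware (ACPI/MP table) APIC id
 * and let the firmware value win on a mismatch.
 * Returns the number of fields that had to be corrected (0, 1 or 2).
 */
static int sanitize_ids(struct cpu_ids *c, unsigned int cpu,
			unsigned int fw_apicid, unsigned int coreid_bits)
{
	/* Package id: firmware APIC id with the core id bits shifted out. */
	unsigned int pkg = fw_apicid >> coreid_bits;
	int fixups = 0;

	if (fw_apicid != c->initial_apicid) {
		printf("CPU%u: APIC id mismatch. Firmware: %x CPUID: %x\n",
		       cpu, fw_apicid, c->initial_apicid);
		c->initial_apicid = fw_apicid;
		fixups++;
	}
	if (pkg != c->phys_proc_id) {
		printf("CPU%u: Using firmware package id %u instead of %u\n",
		       cpu, pkg, c->phys_proc_id);
		c->phys_proc_id = pkg;
		fixups++;
	}
	return fixups;
}
```

With the values from Chas's report (CPU1, firmware APIC id 1, CPUID APIC id 2, one core per socket and therefore zero core id bits), calling `sanitize_ids(&c, 1, 1, 0)` on a struct holding `{2, 2}` corrects both fields to 1. When firmware and CPUID agree, nothing is printed and nothing is changed.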