On 02/17/2017 at 05:07 PM, Borislav Petkov wrote:
> On Fri, Feb 17, 2017 at 09:53:21AM +0800, Xunlei Pang wrote:
>> It changes the value of cpu_online_mask etc., which will confuse vmcore analysis.
> Then export the crashing_cpu variable, initialize it to something
> invalid in the first kernel, -1 for example, and test it in the #MC
> handler like this:
>
> 	int cpu;
>
> 	...
>
> 	cpu = smp_processor_id();
>
> 	if (cpu_is_offline(cpu) ||
> 	    ((crashing_cpu != -1) && (crashing_cpu != cpu))) {
> 		u64 mcgstatus;
>
> 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
> 		if (mcgstatus & MCG_STATUS_RIPV) {
> 			mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> 			return;
> 		}
> 	}

Yes, that is doable. I will run some tests later.

>> Moreover, for this code (see the inline comment):
>>
>> 	if (cpu_is_offline(smp_processor_id())) {
>> 		u64 mcgstatus;
>>
>> 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
>> 		if (mcgstatus & MCG_STATUS_RIPV) { // This condition may not be true: an MCE triggered on the kdump CPU
>> 						   // does not need to set this bit on the other CPUs that remain in the 1st kernel.
> Is this on kvm or on real hardware? Because for kvm I don't care. And
> don't say "theoretically".
>

This is from my own understanding; I did not find an explicit description
of this point in the Intel SDM. If a broadcast SRAO arrives on real
hardware, will MSR_IA32_MCG_STATUS on every CPU have the MCG_STATUS_RIPV
bit set?

Regards,
Xunlei
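
For reference, the check suggested in the quoted reply can be modeled in user space to see which CPUs it would let bail out of the #MC handler. This is only a sketch under stated assumptions, not kernel code: struct model_cpu, model_mce_should_ignore() and the MODEL_* names are hypothetical, and the per-CPU MSR is replaced by a plain struct field. The only fact taken from the hardware definition is that RIPV is bit 0 of IA32_MCG_STATUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RIPV is bit 0 of IA32_MCG_STATUS; the MODEL_ name itself is hypothetical. */
#define MODEL_MCG_STATUS_RIPV	(1ULL << 0)

/* Hypothetical sentinel meaning "no CPU has crashed" (first kernel, normal run). */
#define MODEL_NO_CRASHING_CPU	(-1)

/* Stand-in for one CPU; mcg_status models that CPU's MSR_IA32_MCG_STATUS. */
struct model_cpu {
	int id;
	bool online;
	uint64_t mcg_status;
};

/*
 * Returns true when the modeled handler would clear MCG_STATUS and return
 * early: the CPU is offline, or some other CPU is the crashing one, AND
 * RIPV is set. In every other case it returns false, i.e. the normal #MC
 * handling path would run.
 */
bool model_mce_should_ignore(struct model_cpu *c, int crashing_cpu)
{
	bool bystander = !c->online ||
		(crashing_cpu != MODEL_NO_CRASHING_CPU && crashing_cpu != c->id);

	if (bystander && (c->mcg_status & MODEL_MCG_STATUS_RIPV)) {
		c->mcg_status = 0;	/* models mce_wrmsrl(MSR_IA32_MCG_STATUS, 0) */
		return true;
	}
	return false;
}

/* Small scenario: CPU 0 crashed and runs kdump, CPUs 1 and 2 stayed behind. */
void model_self_check(void)
{
	struct model_cpu crashing  = { .id = 0, .online = true, .mcg_status = MODEL_MCG_STATUS_RIPV };
	struct model_cpu bystander = { .id = 1, .online = true, .mcg_status = MODEL_MCG_STATUS_RIPV };
	struct model_cpu no_ripv   = { .id = 2, .online = true, .mcg_status = 0 };

	assert(!model_mce_should_ignore(&crashing, 0));	/* crashing CPU handles the #MC */
	assert(model_mce_should_ignore(&bystander, 0));	/* bystander with RIPV bails out */
	assert(bystander.mcg_status == 0);		/* and its status was cleared */
	assert(!model_mce_should_ignore(&no_ripv, 0));	/* bystander without RIPV falls through */
}
```

The last assertion is the case the question above is about: if a bystander CPU can see the broadcast #MC without RIPV set, the early return is skipped in this model and the CPU continues into the regular handler. Whether real hardware can produce that state is exactly what is not settled here.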