On 02/21/2017 at 04:26 AM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote:
>> There is a small window between crash and kdump kernel boot, so
>> if a SRAO comes within this window it will also cause the mce
>> synchronization problem on the crashing cpu if we don't bail out the
>> crashing cpu.
> You mean, in the window between, kdump kernel starts writing out memory
> and the second, kexec-ed kernel?

No, not when the kdump kernel starts dumping. I mean during
nmi_shootdown_cpus(): if an MCE arrives after crashing_cpu was set and
we don't skip crashing_cpu, then the crashing cpu will enter the mce
handler and trigger the synchronization issue.

>
> If so, please add that information to the place in do_machine_check()
> where we check crashing_cpu so that we know why we're doing this
> temporary ignore of #MC.

Ok, will add, thanks for the feedback.

Regards,
Xunlei