On Mon, Jan 23, 2017 at 03:50:56PM +0100, Borislav Petkov wrote:
> On Mon, Jan 23, 2017 at 09:35:53PM +0800, Xunlei Pang wrote:
> > One possible timing sequence would be:
> > 1st kernel running on multiple cpus panicked
> > then the crash dump code starts
> > the crash dump code stops the others cpus except the crashing one
> > 2nd kernel boots up on the crash cpu with "nr_cpus=1"
> > some broadcasted mce comes on some cpu amongst the other cpus(not the crashing cpu)
>
> Where does this broadcasted MCE come from?
>
> The crash dump code triggered it? Or it happened before the panic()?
>
> Are you talking about an *actual* sequence which you're experiencing on
> real hw or is this something hypothetical?

If the system had experienced some memory corruption, but
recovered ... then there would be some pages sitting around
that the old kernel had marked as POISON and stopped using.
The kexec'd kernel doesn't know about these, so may touch that
memory while taking a crash dump ... and then you have a
broadcast machine check (on older[1] Intel CPUs that don't support
local machine check).

This is hard to work around. You really need all the CPUs
to have set CR4.MCE=1 (if any didn't, then they will force
a reset when they see the machine check). Also you need to make
sure that they jump to the copy of do_machine_check() in the
new kernel, not the old kernel.

A while ago I played with the nr_cpus=N code to have it bring
all the CPUs far enough online to get the machine check
initialization done, then any extras above "N" just go back
offline again. But I never got this to work reliably.

-Tony

[1] older == all released ones, at the moment.
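
To make the CR4.MCE condition above concrete, here is a rough user-space
sketch. It is only a model, not kernel code: the helper name
broadcast_mce_forces_reset() and the idea of passing in a snapshot of
per-CPU CR4 values are made up for illustration. The one hardware fact it
relies on is that CR4.MCE is bit 6 of CR4. It does not model the second
requirement (each CPU vectoring to the new kernel's do_machine_check()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CR4.MCE is bit 6 of the x86 CR4 register. */
#define CR4_MCE (UINT64_C(1) << 6)

/*
 * Hypothetical helper, not a kernel function.
 *
 * Given a snapshot of each CPU's CR4 value, return true if a broadcast
 * machine check would force a reset, i.e. if at least one CPU that
 * receives the broadcast has CR4.MCE clear.
 */
bool broadcast_mce_forces_reset(const uint64_t *cr4, size_t ncpus)
{
    assert(cr4 != NULL || ncpus == 0);

    for (size_t i = 0; i < ncpus; i++) {
        if (!(cr4[i] & CR4_MCE))
            return true;    /* this CPU cannot take the #MC */
    }
    return false;           /* every CPU has machine checks enabled */
}
```

With a snapshot where any single entry has bit 6 clear the helper returns
true, which is the "if any didn't, then they will force a reset" case;
only when every entry has the bit set does it return false.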