On 04/11/2012 09:59 PM, Eric Northup wrote:
> On Wed, Apr 11, 2012 at 7:45 AM, Avi Kivity <avi@xxxxxxxxxx> wrote:
> > On 04/11/2012 05:11 AM, Peijie Yu wrote:
> >> For this problem, i found that panic is caused by
> >> BUG_ON(in_nmi()) which means NMI happened during another NMI Context;
> >> But i check the Intel Technical Manual and found "While an NMI
> >> interrupt handler is executing, the processor disables additional
> >> calls to the NMI handler until the next IRET instruction is executed."
> >> So, how this happen?
> >>
> >
> > The NMI path for kvm is different; the processor exits from the guest
> > with NMIs blocked, then executes kvm code until it issues "int $2" in
> > vmx_complete_interrupts(). If an IRET is executed in this path, then
> > NMIs will be unblocked and nested NMIs may occur.
> >
> > One way this can happen is if we access the vmap area and incur a fault,
> > between the VMEXIT and invoking the NMI handler. Or perhaps the NMI
> > handler itself generates a fault. Or we have a debug exception in that path.
> >
> > Is this reproducible?
>
> As an FYI, there have been BIOSes whose SMI handlers ran IRETs. So
> the NMI blocking can go away surprisingly.
>
> See 29.8 "NMI handling while in SMM" in the Intel SDM vol 3.

Interesting, thanks. From 29.8 it looks like you don't even need to
issue IRET within SMM, since SMM doesn't save/restore the NMI blocking flag.

However, this being a server, and the crash being in kvm code, I don't
think we can rule out that this is a kvm bug.

-- 
error compiling committee.c: too many arguments to function
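
For readers following the thread, the sequence discussed above (VMEXIT with NMIs blocked, a stray IRET, then "int $2") can be sketched as a toy state machine. This is only an illustrative model, not kvm or kernel code: every `toy_*` name is hypothetical, and it assumes that the software "int $2" enters the handler without setting the hardware NMI-blocking latch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not a kernel or KVM symbol. */
struct toy_cpu {
    bool nmi_blocked;  /* models the hardware NMI-blocking latch */
    int  nmi_depth;    /* number of NMI handlers currently running */
    bool nested_seen;  /* set where the real code would hit BUG_ON(in_nmi()) */
};

/* VM exit caused by an NMI: the CPU leaves the guest with NMIs blocked. */
static void toy_vmexit_on_nmi(struct toy_cpu *cpu)
{
    cpu->nmi_blocked = true;
}

/* Any IRET (for example the return from a page fault) clears the latch. */
static void toy_iret(struct toy_cpu *cpu)
{
    cpu->nmi_blocked = false;
}

/* Handler entry, shared by "int $2" and by hardware delivery. */
static void toy_nmi_enter(struct toy_cpu *cpu)
{
    if (cpu->nmi_depth > 0)
        cpu->nested_seen = true;   /* a handler was already running */
    cpu->nmi_depth++;
}

/* A hardware NMI arrives; it is delivered only if the latch is clear. */
static bool toy_hw_nmi(struct toy_cpu *cpu)
{
    if (cpu->nmi_blocked)
        return false;              /* stays pending in this model */
    cpu->nmi_blocked = true;
    toy_nmi_enter(cpu);
    return true;
}

/* Returns true if a nested NMI was observed in the modelled sequence. */
static bool toy_scenario(bool fault_before_handler)
{
    struct toy_cpu cpu = { false, 0, false };

    toy_vmexit_on_nmi(&cpu);
    if (fault_before_handler)
        toy_iret(&cpu);            /* fault between VMEXIT and the handler */
    toy_nmi_enter(&cpu);           /* "int $2"; latch left untouched (assumption) */
    toy_hw_nmi(&cpu);              /* a second NMI arrives while the handler runs */
    assert(cpu.nmi_depth >= 1);
    return cpu.nested_seen;
}
```

In this model `toy_scenario(false)` reports no nesting, because the second NMI stays blocked, while `toy_scenario(true)` reports a nested NMI. The SMM case raised by Eric would behave like the `toy_iret()` step here: anything that clears the latch before the handler finishes opens the same window.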