> On Apr 24, 2019, at 2:49 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Wed, Apr 24, 2019 at 01:59:21PM -0700, Nadav Amit wrote:
>>> On Apr 24, 2019, at 1:55 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
>>>
>>> On Tue, Apr 23, 2019 at 09:50:59PM -0700, nadav.amit@xxxxxxxxx wrote:
>>>> From: Nadav Amit <nadav.amit@xxxxxxxxx>
>>>>
>>>> According to the discussion in [1], expecting nested NMIs never to be
>>>
>>> s/nested/multiple pending
>>>
>>>> collapsed is wrong.
>>>
>>> It'd also be helpful to quote the SDM or APM, although it's admittedly
>>> difficult to find a relevant blurb in the SDM. The only explicit statement
>>> regarding the number of latched/pending NMIs I could find was for SMM:
>>>
>>>   34.8 NMI HANDLING WHILE IN SMM
>>>   NMI interrupts are blocked upon entry to the SMI handler. If an NMI request
>>>   occurs during the SMI handler, it is latched and serviced after the
>>>   processor exits SMM. Only one NMI request will be latched during the SMI
>>>   handler. If an NMI request is pending when the processor executes the RSM
>>>   instruction, the NMI is serviced before the next instruction of the
>>>   interrupted code sequence. This assumes that NMIs were not blocked before
>>>   the SMI occurred. If NMIs were blocked before the SMI occurred, they are
>>>   blocked after execution of RSM.
>>>
>>> All that being said, removing the test is correct as it's blatantly
>>> subject to a race condition between vCPUs.
>>>
>>> It probably makes sense to add a single-threaded test that pends an NMI
>>> from inside the NMI handler to ensure that KVM pends NMIs correctly.
>>> I'll send a patch.
>>
>> Thanks, Sean. I thought that quoting you should be enough. ;-)
>>
>> Paolo, please let me know if you have any further feedback, so I will know
>> what to include in v2.
>
> Thinking about the test a bit more and rereading the KVM code, I actually
> think we should keep the test.
>
> Architecturally I don't think there are any guarantees regarding
> simultaneous NMIs, but practically speaking the probability of NMIs
> being collapsed (on hardware) when NMIs aren't blocked is nil. So while
> it may be architecturally legal for a VMM to drop an NMI in this case,
> it's reasonable for software to expect two NMIs to be received.
>
> And that's why KVM explicitly allows two NMIs to be pending when NMIs
> aren't blocked.
>
> Did you observe a test failure or was this found by inspection?

I observed a failure (not on KVM), but I now see that bare-metal indeed
allows this test to pass. So I can understand if you do not want to remove
the test.