On Thu, Sep 15, 2011 at 08:48:58PM +0300, Avi Kivity wrote:
> On 09/15/2011 08:25 PM, Jan Kiszka wrote:
> >>
> >> I think so. Suppose the vcpu enters just after kvm_make_request(); it
> >> sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because
> >> it wasn't set yet. Then comes a kick, the guest is reentered with
> >> nmi_pending set but KVM_REQ_EVENT clear and sails through the check
> >> and enters the guest. The NMI is delayed until the next KVM_REQ_EVENT.
> >
> > That makes sense - and the old code looks more strange now.
>
> I think it dates to the time all NMIs were synchronous.
>
> >>>>
> >>>>  	/* try to inject new event if pending */
> >>>> -	if (vcpu->arch.nmi_pending) {
> >>>> +	if (atomic_read(&vcpu->arch.nmi_pending)) {
> >>>>  		if (kvm_x86_ops->nmi_allowed(vcpu)) {
> >>>> -			vcpu->arch.nmi_pending = false;
> >>>> +			atomic_dec(&vcpu->arch.nmi_pending);
> >>>
> >>> Here we lost NMIs in the past by overwriting nmi_pending while
> >>> another one was already queued, right?
> >>
> >> One place, yes. The other is kvm_inject_nmi() - if the first NMI
> >> didn't get picked up by the vcpu by the time the second NMI arrives,
> >> we lose the second NMI.
> >
> > Thinking this through again, it's actually not yet clear to me what we
> > are modeling here: if two NMI events arrive almost perfectly in
> > parallel, does the real hardware guarantee that they will always cause
> > two NMI events in the CPU? Then this is required.
>
> It's not 100% clear from the SDM, but this is what I understood from
> it. And it's needed - the NMI handlers are now being reworked to
> handle just one NMI source (hopefully the cheapest) in the handler,
> and if we detect a back-to-back NMI, handle all possible NMI sources.
> This optimization is needed in turn so we can use Jeremy's paravirt
> spinlock framework, which requires a sleep primitive and a
> wake-up-even-if-the-sleeper-has-interrupts-disabled primitive. I
> thought of using HLT and NMIs respectively, but that means we need a
> cheap handler (i.e. don't go reading PMU MSRs).
>
> > Otherwise I just lost understanding again why we were losing NMIs
> > here and in kvm_inject_nmi (maybe elsewhere then?).
>
> Because of that.
>
> >>
> >>>>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> >>>>  		inject_pending_event(vcpu);
> >>>>
> >>>>  		/* enable NMI/IRQ window open exits if needed */
> >>>> -		if (nmi_pending)
> >>>> +		if (atomic_read(&vcpu->arch.nmi_pending)
> >>>> +		    && nmi_in_progress(vcpu))
> >>>
> >>> Is nmi_pending && !nmi_in_progress possible at all?
> >>
> >> Yes, due to NMI-blocked-by-STI. A really touchy area.
> >
> > And we don't need the window exit notification then? I don't
> > understand what nmi_in_progress is supposed to do here.
>
> We need the window notification in both cases. If we're recovering
> from STI, then we don't need to collapse NMIs. If we're completing an
> NMI handler, then we do need to collapse NMIs (since the queue length
> is two, and we just completed one).

I don't understand the point of nmi_in_progress, or the above hunk,
either. Can't inject_nmi do

	if (nmi_injected + atomic_read(&nmi_pending) < 2)
		atomic_inc(&nmi_pending);

instead of collapsing somewhere else? You'd also have to change
nmi_injected handling in arch code so its value is not "hidden" in
complete_interrupts().
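Something like this, say - an untested sketch, assuming nmi_injected is
made visible outside complete_interrupts() as above, and ignoring for
now that the test and the increment below are not one atomic operation
(a real version would have to close that window):

	void kvm_inject_nmi(struct kvm_vcpu *vcpu)
	{
		/*
		 * Real hardware can latch at most one NMI while another
		 * is being handled, so cap the queue at two: one NMI in
		 * flight (nmi_injected) plus one pending.
		 */
		if (vcpu->arch.nmi_injected +
		    atomic_read(&vcpu->arch.nmi_pending) < 2)
			atomic_inc(&vcpu->arch.nmi_pending);
		kvm_make_request(KVM_REQ_EVENT, vcpu);
		kvm_vcpu_kick(vcpu);
	}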
> >>
> >>> If not, what will happen next?
> >>
> >> The NMI window will open and we'll inject the NMI.
> >
> > How will we know this? We do not request the exit, that's my worry.
>
> I think we do?

Oh, but this patch breaks it.
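For reference, the ordering the lost-NMI discussion at the top depends
on, as I understand it - a rough sketch, not the actual code, and
whether the atomics alone provide the needed barriers on all
architectures is glossed over here:

	/* producer side */
	atomic_inc(&vcpu->arch.nmi_pending);	/* 1: queue the NMI first */
	kvm_make_request(KVM_REQ_EVENT, vcpu);	/* 2: then raise the event */

	/* consumer side, in vcpu_enter_guest() */
	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
		/*
		 * Because nmi_pending is published before KVM_REQ_EVENT,
		 * a vcpu that observes and clears the request is
		 * guaranteed to also observe nmi_pending; with the
		 * reverse order, the NMI could be delayed until the
		 * next KVM_REQ_EVENT, as described above.
		 */
		inject_pending_event(vcpu);
	}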