On 2011-09-15 19:02, Avi Kivity wrote:
> On 09/15/2011 07:01 PM, Jan Kiszka wrote:
>> On 2011-09-15 16:45, Avi Kivity wrote:
>>> If simultaneous NMIs happen, we're supposed to queue the second
>>> and next (collapsing them), but currently we sometimes collapse
>>> the second into the first.
>>
>> Can you describe the race in a bit more detail here ("sometimes" sounds
>> like "I don't know when" :) )?
>
> In this case it was "I'm in a hurry".
>
>>>
>>>  void kvm_inject_nmi(struct kvm_vcpu *vcpu)
>>>  {
>>> +        atomic_inc(&vcpu->arch.nmi_pending);
>>>          kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> -        vcpu->arch.nmi_pending = 1;
>>
>> Does the reordering matter?
>
> I think so.  Suppose the vcpu enters just after kvm_make_request(); it
> sees KVM_REQ_EVENT and clears it, but doesn't see nmi_pending because it
> wasn't set yet.  Then comes a kick; the vcpu is reentered with
> nmi_pending set but KVM_REQ_EVENT clear, sails through the check, and
> enters the guest.  The NMI is delayed until the next KVM_REQ_EVENT.

That makes sense - and the old code looks more strange now.

>
>> Do we need barriers?
>
> Yes.
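
Something like the following on the injection side, then? Just a sketch on
top of your patch - on x86 the locked atomic already provides the ordering,
so smp_mb__after_atomic_inc() degenerates to a compiler barrier there, but
it would document the contract for other archs:

 void kvm_inject_nmi(struct kvm_vcpu *vcpu)
 {
         atomic_inc(&vcpu->arch.nmi_pending);
         /*
          * Make nmi_pending visible before the request bit so that a
          * vcpu which observes KVM_REQ_EVENT also observes
          * nmi_pending != 0.
          */
         smp_mb__after_atomic_inc();
         kvm_make_request(KVM_REQ_EVENT, vcpu);
 }

plus a matching smp_rmb() between kvm_check_request(KVM_REQ_EVENT, vcpu)
and the atomic_read() of nmi_pending on the vcpu_enter_guest() side.
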
>
>>
>>> @@ -5570,9 +5570,9 @@ static void inject_pending_event(struct kvm_vcpu *vcpu)
>>>          }
>>>
>>>          /* try to inject new event if pending */
>>> -        if (vcpu->arch.nmi_pending) {
>>> +        if (atomic_read(&vcpu->arch.nmi_pending)) {
>>>                  if (kvm_x86_ops->nmi_allowed(vcpu)) {
>>> -                        vcpu->arch.nmi_pending = false;
>>> +                        atomic_dec(&vcpu->arch.nmi_pending);
>>
>> Here we lost NMIs in the past by overwriting nmi_pending while another
>> one was already queued, right?
>
> One place, yes.  The other is kvm_inject_nmi() - if the first nmi didn't
> get picked up by the vcpu by the time the second nmi arrives, we lose
> the second nmi.

Thinking this through again, it's actually not yet clear to me what we
are modeling here: If two NMI events arrive almost perfectly in
parallel, does real hardware guarantee that they always cause two NMI
deliveries to the CPU? If so, this queueing is required. Otherwise I've
just lost my understanding again of why we were losing NMIs here and in
kvm_inject_nmi (or maybe elsewhere?).

>
>>>          if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>>>                  inject_pending_event(vcpu);
>>>
>>>                  /* enable NMI/IRQ window open exits if needed */
>>> -                if (nmi_pending)
>>> +                if (atomic_read(&vcpu->arch.nmi_pending)
>>> +                    && nmi_in_progress(vcpu))
>>
>> Is nmi_pending && !nmi_in_progress possible at all?
>
> Yes, due to NMI-blocked-by-STI.  A really touchy area.

And we don't need the window exit notification then? I don't understand
what nmi_in_progress is supposed to do here.

>
>> Is it rather a BUG condition?
>
> No.
>
>> If not, what will happen next?
>
> The NMI window will open and we'll inject the NMI.

How will we know this? We do not request the exit, that's my worry.

> But I think we have
> a bug here - we should only kvm_collapse_pending_nmis() if an NMI
> handler was indeed running, yet we do it unconditionally.
>
>>>
>>> +static inline void kvm_collapse_pending_nmis(struct kvm_vcpu *vcpu)
>>> +{
>>> +        /* Collapse all NMIs queued while an NMI handler was running to one */
>>> +        if (atomic_read(&vcpu->arch.nmi_pending))
>>> +                atomic_set(&vcpu->arch.nmi_pending, 1);
>>
>> Is it OK that NMIs injected after the collapse will increment this to
>> >1 again? Or is that impossible?
>>
>
> It's possible and okay.  We're now completing execution of IRET.  Doing
> atomic_set() after atomic_inc() means the NMI happened before IRET
> completed, and vice versa.  Since these events are asynchronous, we're
> free to choose one or the other (a self-IPI-NMI just before the IRET
> must be swallowed, and a self-IPI-NMI just after the IRET would only be
> executed after the next time around the handler).

Need to think through this separately.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux