On 23/04/20 17:35, Sean Christopherson wrote:
> On Thu, Apr 23, 2020 at 05:10:45PM +0200, Paolo Bonzini wrote:
>> On 23/04/20 16:42, Sean Christopherson wrote:
>>> On Tue, Apr 14, 2020 at 04:11:07PM -0400, Cathy Avery wrote:
>>>> With NMI intercept moved to check_nested_events there is a race
>>>> condition where vcpu->arch.nmi_pending is set late causing
>>> How is nmi_pending set late? The KVM_{G,S}ET_VCPU_EVENTS paths can't set
>>> it because the current KVM_RUN thread holds the mutex, and the only other
>>> call to process_nmi() is in the request path of vcpu_enter_guest, which has
>>> already executed.
>>>
>> I think the actual cause is priority inversion between NMI and
>> interrupts, because NMI is added last in patch 1.
> Ah, that makes more sense. I stared/glared at this exact code for a long
> while and came to the conclusion that the "late" behavior was exclusive to
> interrupts, would have been a shame if all that glaring was for naught.
>

Ah no, it's a bug in Cathy's patch, and it's a weird one.  The problem is
that on AMD you exit guest mode with the NMI latched and GIF=0, so
check_nested_events should enable the NMI window in addition to causing
a vmexit.

So why does it work?  Because on AMD we don't have nested_run_pending
(yet), so check_nested_events just checks whether a vmexit is already
scheduled and, if so, returns -EBUSY.  The second call makes
inject_pending_event return -EBUSY too, so we go through KVM_REQ_EVENT
again, which enables the NMI window.

Paolo
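
For what it's worth, here is a toy, self-contained model of that
sequence.  It is not the real code (the struct, fields and helpers are
invented for illustration and only loosely follow arch/x86/kvm), but it
shows the chain: check_nested_events returns -EBUSY because a vmexit is
already scheduled, inject_pending_event propagates it, and the retried
KVM_REQ_EVENT pass is what ends up enabling the NMI window.

/*
 * Toy model of the flow above.  None of this is the real KVM code;
 * the struct and helpers are made up for illustration.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct vcpu {
	bool guest_mode;	/* running L2 */
	bool nmi_pending;	/* NMI latched, waiting for delivery */
	bool gif;		/* global interrupt flag */
	bool exit_required;	/* nested vmexit already scheduled */
	bool nmi_window;	/* stand-in for enable_nmi_window() */
	bool req_event;		/* stand-in for KVM_REQ_EVENT */
};

/*
 * Without nested_run_pending, the only "busy" condition is a vmexit
 * that is already scheduled.
 */
static int check_nested_events(struct vcpu *v)
{
	if (v->exit_required)
		return -EBUSY;
	if (v->guest_mode && v->nmi_pending) {
		v->exit_required = true;	/* NMI intercept -> nested vmexit */
		v->gif = false;			/* L1 will resume with GIF=0 */
		/* the bug: the NMI window should also be enabled here */
	}
	return 0;
}

static int inject_pending_event(struct vcpu *v)
{
	int r = check_nested_events(v);

	if (r)
		return r;
	if (v->nmi_pending && v->gif)		/* NMIs blocked while GIF=0 */
		v->nmi_pending = false;		/* inject the NMI */
	return 0;
}

/* The KVM_REQ_EVENT branch of vcpu_enter_guest, much simplified. */
static void req_event_pass(struct vcpu *v)
{
	v->req_event = false;
	if (inject_pending_event(v)) {
		v->req_event = true;		/* -EBUSY: retry later */
		return;
	}
	if (v->nmi_pending)
		v->nmi_window = true;		/* enable_nmi_window() */
}

int main(void)
{
	struct vcpu v = {
		.guest_mode = true, .nmi_pending = true,
		.gif = true, .req_event = true,
	};

	/* First call schedules the vmexit but forgets the NMI window. */
	check_nested_events(&v);

	/* Second call returns -EBUSY, so KVM_REQ_EVENT stays pending. */
	req_event_pass(&v);

	/* The scheduled vmexit is delivered; we are back in L1. */
	v.exit_required = false;
	v.guest_mode = false;

	/* Retried pass: NMI still latched, GIF=0 -> window enabled. */
	req_event_pass(&v);

	printf("nmi_window=%d nmi_pending=%d\n", v.nmi_window, v.nmi_pending);
	return 0;
}

Run through, this prints nmi_window=1 nmi_pending=1: the window is only
requested on the retried pass, which is exactly why the missing
enable_nmi_window() call in check_nested_events goes unnoticed.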