On 4/23/20 10:42 AM, Sean Christopherson wrote:
On Tue, Apr 14, 2020 at 04:11:07PM -0400, Cathy Avery wrote:
With NMI intercept moved to check_nested_events there is a race
condition where vcpu->arch.nmi_pending is set late causing
How is nmi_pending set late? The KVM_{G,S}ET_VCPU_EVENTS paths can't set
it because the current KVM_RUN thread holds the mutex, and the only other
call to process_nmi() is in the request path of vcpu_enter_guest, which has
already executed.
You will have to forgive me as I am new to KVM and any help would be
most appreciated. This is what I noticed about how an NMI intercept is
processed now that it is implemented in check_nested_events.
When check_nested_events is called from inject_pending_event ...
check_nested_events needs to have already been called (kvm_vcpu_running
with vcpu->arch.nmi_pending = 1) to set up the NMI intercept and set
svm->nested.exit_required. Otherwise we do not exit from the second
check_nested_events call (code below) with a return of -EBUSY, which
allows us to immediately vmexit.
	/*
	 * Call check_nested_events() even if we reinjected a previous event
	 * in order for caller to determine if it should require immediate-exit
	 * from L2 to L1 due to pending L1 events which require exit
	 * from L2 to L1.
	 */
	if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
		r = kvm_x86_ops.check_nested_events(vcpu);
		if (r != 0)
			return r;
	}
Unfortunately when kvm_vcpu_running is called vcpu->arch.nmi_pending is
not yet set.
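
To make the ordering concrete, here is a toy model of the sequence
described above. It is only a sketch under stated assumptions, not the
real KVM code: struct toy_vcpu, nmi_queued, and extra_check are
hypothetical names, and toy_check_nested_events() keeps only the
behaviour described in this mail (the first call that sees a pending
NMI sets exit_required, a later call returns -EBUSY).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the vCPU state discussed above; NOT struct kvm_vcpu. */
struct toy_vcpu {
	bool nmi_queued;      /* hypothetical: NMI accepted, not yet counted */
	int  nmi_pending;     /* models vcpu->arch.nmi_pending */
	bool exit_required;   /* models svm->nested.exit_required */
	bool nmi_injected_l2; /* NMI was injected into L2, no exit to L1 */
};

/* Simplified model: arm the exit on first sight of a pending NMI,
 * return -EBUSY once the exit is already armed. */
static int toy_check_nested_events(struct toy_vcpu *v)
{
	if (!v->nmi_pending)
		return 0;
	if (v->exit_required)
		return -EBUSY;
	v->exit_required = true;
	return 0;
}

/* One pass through the run loop. extra_check models the additional
 * call added by the patch in the nmi_pending branch. */
static int toy_run_iteration(struct toy_vcpu *v, bool extra_check)
{
	int r;

	assert(v != NULL);

	/* kvm_vcpu_running(): nmi_pending is still 0 at this point. */
	toy_check_nested_events(v);

	/* Request processing (process_nmi()): nmi_pending is set only now. */
	if (v->nmi_queued) {
		v->nmi_queued = false;
		v->nmi_pending++;
	}

	/* inject_pending_event(): the existing call. */
	r = toy_check_nested_events(v);
	if (r != 0)
		return r;

	if (v->nmi_pending) {
		if (extra_check) {
			r = toy_check_nested_events(v);
			if (r != 0)
				return r; /* caller requests immediate exit */
		}
		v->nmi_pending--;
		v->nmi_injected_l2 = true;
	}
	return 0;
}
```

In this model, starting from nmi_queued = true, a pass with
extra_check = false ends with the NMI injected into L2, while a pass
with extra_check = true returns -EBUSY before any injection.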
Here is the trace snippet (with some debug output) without the second call to
check_nested_events.
Thanks,
Cathy
qemu-system-x86-2029 [040] 232.168269: kvm_entry: vcpu 0
qemu-system-x86-2029 [040] 232.168271: kvm_exit: reason EXIT_MSR rip 0x405371 info 1 0
qemu-system-x86-2029 [040] 232.168272: kvm_nested_vmexit: rip 405371 reason EXIT_MSR info1 1 info2 0 int_info 0 int_info_err 0
qemu-system-x86-2029 [040] 232.168273: kvm_apic: apic_write APIC_ICR2 = 0x0
qemu-system-x86-2029 [040] 232.168274: kvm_apic: apic_write APIC_ICR = 0x44400
qemu-system-x86-2029 [040] 232.168275: kvm_apic_ipi: dst 0 vec 0 (NMI|physical|assert|edge|self)
qemu-system-x86-2029 [040] 232.168277: kvm_apic_accept_irq: apicid 0 vec 0 (NMI|edge)
qemu-system-x86-2029 [040] 232.168278: kvm_msr: msr_write 830 = 0x44400
qemu-system-x86-2029 [040] 232.168279: bprint: svm_check_nested_events: svm_check_nested_events reinj = 0, exit_req = 0
qemu-system-x86-2029 [040] 232.168279: bprint: svm_check_nested_events: svm_check_nested_events nmi pending = 0
qemu-system-x86-2029 [040] 232.168279: bputs: vcpu_enter_guest: inject_pending_event 1
qemu-system-x86-2029 [040] 232.168279: bprint: svm_check_nested_events: svm_check_nested_events reinj = 0, exit_req = 0
qemu-system-x86-2029 [040] 232.168279: bprint: svm_check_nested_events: svm_check_nested_events nmi pending = 1
qemu-system-x86-2029 [040] 232.168280: bprint: svm_nmi_allowed: svm_nmi_allowed ret 1
qemu-system-x86-2029 [040] 232.168280: bputs: svm_inject_nmi: svm_inject_nmi
qemu-system-x86-2029 [040] 232.168280: bprint: vcpu_enter_guest: nmi_pending 0
qemu-system-x86-2029 [040] 232.168281: kvm_entry: vcpu 0
qemu-system-x86-2029 [040] 232.168282: kvm_exit: reason EXIT_NMI rip 0x405373 info 1 0
qemu-system-x86-2029 [040] 232.168284: kvm_nested_vmexit_inject: reason EXIT_NMI info1 1 info2 0 int_info 0 int_info_err 0
qemu-system-x86-2029 [040] 232.168285: kvm_entry: vcpu 0
the execution of check_nested_events to not set up
nested.exit_required correctly. A second call to check_nested_events
allows the injectable NMI to be detected in time to
require an immediate exit from L2 to L1.
Signed-off-by: Cathy Avery <cavery@xxxxxxxxxx>
---
arch/x86/kvm/x86.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 027dfd278a97..ecfafcd93536 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7734,10 +7734,17 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
 		vcpu->arch.smi_pending = false;
 		++vcpu->arch.smi_count;
 		enter_smm(vcpu);
-	} else if (vcpu->arch.nmi_pending && kvm_x86_ops.nmi_allowed(vcpu)) {
-		--vcpu->arch.nmi_pending;
-		vcpu->arch.nmi_injected = true;
-		kvm_x86_ops.set_nmi(vcpu);
+	} else if (vcpu->arch.nmi_pending) {
+		if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
+			r = kvm_x86_ops.check_nested_events(vcpu);
+			if (r != 0)
+				return r;
+		}
+		if (kvm_x86_ops.nmi_allowed(vcpu)) {
+			--vcpu->arch.nmi_pending;
+			vcpu->arch.nmi_injected = true;
+			kvm_x86_ops.set_nmi(vcpu);
+		}
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
 		/*
 		 * Because interrupts can be injected asynchronously, we are
--
2.20.1