2016-06-02 2:06 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>
>
> On 01/06/2016 18:40, Radim Krčmář wrote:
>> 2016-06-01 14:35+0200, Paolo Bonzini:
>>> If the processor exits to KVM while delivering an interrupt,
>>> the hypervisor then requeues the interrupt for the next vmentry.
>>> Trying to enter SMM in this same window causes the processor to
>>> enter non-root mode in emulated SMM (i.e. with IF=0) and with a
>>> request to inject an IRQ (i.e. with a valid VM-entry interrupt
>>> info field).  This is invalid guest state (SDM 26.3.1.4 "Check on
>>> Guest RIP and RFLAGS") and the processor fails vmentry.
>>>
>>> The fix is to defer the injection from KVM_REQ_SMI to KVM_REQ_EVENT,
>>> like we already do for e.g. NMIs.  This patch doesn't change the
>>> name of the process_smi function so that it can be applied to
>>> stable releases.  The next patch will modify the names so that
>>> process_nmi and process_smi process KVM_REQ_NMI and KVM_REQ_SMI
>>> respectively.
>>>
>>> This is especially common with Windows, probably due to the
>>> self-IPI trick that it uses to deliver deferred procedure
>>> calls (DPCs).
>>>
>>> Reported-by: Laszlo Ersek <lersek@xxxxxxxxxx>
>>> Fixes: 64d6067057d9658acb8675afcfba549abdb7fc16
>>> Cc: stable@xxxxxxxxxxxxxxx
>>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>> ---
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> @@ -6098,7 +6094,10 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
>>>  	}
>>>
>>>  	/* try to inject new event if pending */
>>> -	if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
>>> +	if (vcpu->arch.smi_pending && !is_smm(vcpu)) {
>>
>> Clearing smi_pending in kvm_vcpu_reset() would be safer now that SMI can
>> be injected without a request or RSM.
>
> Indeed.
>
>>> +		--vcpu->arch.smi_pending;
>>
>> (I'd use 'vcpu->arch.smi_pending = false', to make it clearer that we
>> don't want multiple pending SMIs, unlike NMIs.  smi_pending is bool,
>> so the generated code should be identical.)
>
> Right.  Making the code superficially similar for SMI and NMI was nice;
> however, as discussed a while ago we could probably make nmi_pending a
> bool too.
>
>>> +		process_smi(vcpu);
>>> +	} else if (vcpu->arch.nmi_pending && kvm_x86_ops->nmi_allowed(vcpu)) {
>>>  		--vcpu->arch.nmi_pending;
>>
>>
>>> @@ -6621,8 +6631,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>
>>>  	kvm_load_guest_xcr0(vcpu);
>>>
>>> -	if (req_immediate_exit)
>>> +	if (req_immediate_exit) {
>>> +		kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>  		smp_send_reschedule(vcpu->cpu);
>>
>> (Is this a fix for non-smi cases too?)
>
> No, I don't think so, the existing req_immediate_exit case is only after
> a VMLAUNCH/VMRESUME vmexit, in which case we already have a
>
>	if (vmx->nested.nested_run_pending)
>		kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> in vmx_vcpu_run.

Do you think this check can be removed?  It blindly requests KVM_REQ_EVENT
even when there is no still-pending event to L1 blocked by
nested_run_pending.  In contrast, req_immediate_exit does indicate that a
pending event was blocked by nested_run_pending, and the KVM_REQ_EVENT
request added in your patch guarantees that this pending event is injected
after the next nested vmexit.

Regards,
Wanpeng Li
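
(For concreteness, a minimal sketch of the two code paths discussed above,
paraphrased from arch/x86/kvm/x86.c and arch/x86/kvm/vmx.c of this period;
it only illustrates the question and is not a proposed patch.)

	/* vcpu_enter_guest(), with the change from this patch applied: */
	if (req_immediate_exit) {
		kvm_make_request(KVM_REQ_EVENT, vcpu);	/* added by this patch */
		smp_send_reschedule(vcpu->cpu);
	}

	/*
	 * vmx_vcpu_run(): the existing check in question.  It re-enables the
	 * KVM_REQ_EVENT optimization bit after every nested VMLAUNCH/VMRESUME,
	 * whether or not an event to L1 was actually held back.
	 */
	if (vmx->nested.nested_run_pending)
		kvm_make_request(KVM_REQ_EVENT, vcpu);

	vmx->nested.nested_run_pending = 0;

If the new request in vcpu_enter_guest fires whenever an event was really
blocked by nested_run_pending, the unconditional re-request in vmx_vcpu_run
may be redundant; whether that holds in every case is exactly the question
above.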