Re: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection

2017-03-09 9:23 GMT+08:00 Wanpeng Li <kernellwp@xxxxxxxxx>:
> 2016-12-20 0:17 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
>> is blocked", 2015-09-18) the posted interrupt descriptor is checked
>> unconditionally for PIR.ON.  Therefore we don't need KVM_REQ_EVENT to
>> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
>> the complicated event injection path.
>>
>> Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
>> there since APICv was introduced.
>>
>> However, without the KVM_REQ_EVENT safety net KVM needs to be much
>> more careful about races between vmx_deliver_posted_interrupt and
>> vcpu_enter_guest.  First, the IPI for posted interrupts may be issued
>> between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
>> If that happens, kvm_trigger_posted_interrupt returns true, but
>> smp_kvm_posted_intr_ipi doesn't do anything about it.  The guest is
>> entered with PIR.ON, but the posted interrupt IPI has not been sent
>> and the interrupt is only delivered to the guest on the next vmentry
>> (if any).  To fix this, disable interrupts before setting vcpu->mode.
>> This ensures that the IPI is delayed until the guest enters non-root mode;
>> it is then trapped by the processor causing the interrupt to be injected.
>>
>> Second, the IPI may be issued between
>>
>>                         kvm_x86_ops->hwapic_irr_update(vcpu,
>>                                 kvm_lapic_find_highest_irr(vcpu));
>>
>> and vcpu->mode = IN_GUEST_MODE.  In this case, kvm_vcpu_kick is called
>> but it (correctly) doesn't do anything because it sees vcpu->mode ==
>> OUTSIDE_GUEST_MODE.  Again, the guest is entered with PIR.ON but no
>> posted interrupt IPI is pending; this time, the fix for this is to move
>> the RVI update after IN_GUEST_MODE.
>>
>> Both issues were previously masked by the liberal usage of KVM_REQ_EVENT.
>> In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting
>> in another vmentry which would inject the interrupt.
>>
>> This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> ---
>>  arch/x86/kvm/lapic.c | 11 ++++-------
>>  arch/x86/kvm/vmx.c   |  8 +++++---
>>  arch/x86/kvm/x86.c   | 44 +++++++++++++++++++++++++-------------------
>>  3 files changed, 34 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index f644dd1dbe71..5ea94b622e88 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -385,12 +385,8 @@ int __kvm_apic_update_irr(u32 *pir, void *regs)
>>  int kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
>>  {
>>         struct kvm_lapic *apic = vcpu->arch.apic;
>> -       int max_irr;
>>
>> -       max_irr = __kvm_apic_update_irr(pir, apic->regs);
>> -
>> -       kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -       return max_irr;
>> +       return __kvm_apic_update_irr(pir, apic->regs);
>>  }
>>  EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>>
>> @@ -423,9 +419,10 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>>         vcpu = apic->vcpu;
>>
>>         if (unlikely(vcpu->arch.apicv_active)) {
>> -               /* try to update RVI */
>> +               /* need to update RVI */
>>                 apic_clear_vector(vec, apic->regs + APIC_IRR);
>> -               kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +               kvm_x86_ops->hwapic_irr_update(vcpu,
>> +                               apic_find_highest_irr(apic));
>>         } else {
>>                 apic->irr_pending = false;
>>                 apic_clear_vector(vec, apic->regs + APIC_IRR);
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 27e40b180242..3dd4fad35a3e 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -5062,9 +5062,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>>         if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>>                 return;
>>
>> -       r = pi_test_and_set_on(&vmx->pi_desc);
>> -       kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -       if (r || !kvm_vcpu_trigger_posted_interrupt(vcpu))
>> +       /* If a previous notification has sent the IPI, nothing to do.  */
>> +       if (pi_test_and_set_on(&vmx->pi_desc))
>> +               return;
>> +
>> +       if (!kvm_vcpu_trigger_posted_interrupt(vcpu))
>>                 kvm_vcpu_kick(vcpu);
>>  }
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c666414adc1d..725473ba6dd3 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6710,19 +6710,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>                         kvm_hv_process_stimers(vcpu);
>>         }
>>
>> -       /*
>> -        * KVM_REQ_EVENT is not set when posted interrupts are set by
>> -        * VT-d hardware, so we have to update RVI unconditionally.
>> -        */
>> -       if (kvm_lapic_enabled(vcpu)) {
>> -               /*
>> -                * Update architecture specific hints for APIC
>> -                * virtual interrupt delivery.
>> -                */
>> -               if (kvm_x86_ops->sync_pir_to_irr)
>> -                       kvm_x86_ops->sync_pir_to_irr(vcpu);
>> -       }
>> -
>>         if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>>                 ++vcpu->stat.req_event;
>>                 kvm_apic_accept_events(vcpu);
>> @@ -6767,20 +6754,39 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>         kvm_x86_ops->prepare_guest_switch(vcpu);
>>         if (vcpu->fpu_active)
>>                 kvm_load_guest_fpu(vcpu);
>> +
>> +       /*
>> +        * Disabling IRQs before setting IN_GUEST_MODE.  Posted interrupt
>> +        * IPI are then delayed after guest entry, which ensures that they
>> +        * result in virtual interrupt delivery.
>> +        */
>> +       local_irq_disable();
>>         vcpu->mode = IN_GUEST_MODE;
>>
>>         srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>>
>>         /*
>> -        * We should set ->mode before check ->requests,
>> -        * Please see the comment in kvm_make_all_cpus_request.
>> -        * This also orders the write to mode from any reads
>> -        * to the page tables done while the VCPU is running.
>> -        * Please see the comment in kvm_flush_remote_tlbs.
>> +        * 1) We should set ->mode before checking ->requests.  Please see
>> +        * the comment in kvm_make_all_cpus_request.
>> +        *
>> +        * 2) For APICv, we should set ->mode before checking PIR.ON.  This
>> +        * pairs with the memory barrier implicit in pi_test_and_set_on
>> +        * (see vmx_deliver_posted_interrupt).
>> +        *
>> +        * 3) This also orders the write to mode from any reads to the page
>> +        * tables done while the VCPU is running.  Please see the comment
>> +        * in kvm_flush_remote_tlbs.
>>          */
>>         smp_mb__after_srcu_read_unlock();
>>
>> -       local_irq_disable();
>
> The local_irq_disable() movement is unnecessary if you move sync_pir_to_irr.

In addition, moving local_irq_disable() earlier lengthens the window
during which IRQs are disabled. Would it be OK if I send a patch to
revert that part?
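
The orderings under discussion can be compared in a small userspace
model. This is only a sketch under strong simplifying assumptions, not
kernel code: every name in it (struct model_vcpu, post_interrupt,
count_lost, the *_order arrays) is hypothetical, memory barriers and
KVM_REQ_EVENT are ignored, and a posted-interrupt IPI is assumed to be
useful only when it arrives with mode == IN_GUEST_MODE and IRQs already
disabled. It counts, for each ordering of the three steps, the sender
timings in which the interrupt would be missing at vmentry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three receiver-side steps whose relative order is being debated. */
enum step { STEP_SYNC_PIR, STEP_SET_MODE, STEP_IRQ_DISABLE };

/* Hypothetical, heavily simplified stand-in for the vCPU state. */
struct model_vcpu {
    bool in_guest_mode;  /* models vcpu->mode == IN_GUEST_MODE */
    bool irqs_disabled;  /* models local_irq_disable() having run */
    bool pir_on;         /* models PIR.ON in the posted interrupt descriptor */
    bool rvi_updated;    /* models sync_pir_to_irr having seen PIR.ON */
    bool ipi_pending;    /* models an IPI held back until vmentry */
};

/* Sender side: set PIR.ON, then either send the IPI or kick. */
static void post_interrupt(struct model_vcpu *v)
{
    v->pir_on = true;
    if (v->in_guest_mode && v->irqs_disabled)
        v->ipi_pending = true;  /* IPI is taken after entering the guest */
    /*
     * Otherwise nothing is recorded: either the IPI is consumed by a
     * host handler that does nothing, or the kick is a no-op because
     * the vCPU is still outside guest mode.
     */
}

/* Receiver side: execute one step of the entry sequence. */
static void run_step(struct model_vcpu *v, enum step s)
{
    switch (s) {
    case STEP_SYNC_PIR:
        if (v->pir_on)
            v->rvi_updated = true;
        break;
    case STEP_SET_MODE:
        v->in_guest_mode = true;
        break;
    case STEP_IRQ_DISABLE:
        v->irqs_disabled = true;
        break;
    }
}

/*
 * Try the sender before each of the three steps and after the last one
 * (4 slots). Return how many slots leave the interrupt undelivered at
 * vmentry, i.e. neither synced into RVI nor backed by a pending IPI.
 */
static int count_lost(const enum step order[3])
{
    int lost = 0;

    assert(order != NULL);
    for (int slot = 0; slot <= 3; slot++) {
        struct model_vcpu v = { 0 };

        for (int i = 0; i < 3; i++) {
            if (slot == i)
                post_interrupt(&v);
            run_step(&v, order[i]);
        }
        if (slot == 3)
            post_interrupt(&v);
        if (!v.rvi_updated && !v.ipi_pending)
            lost++;
    }
    return lost;
}

/* Old ordering, without the KVM_REQ_EVENT safety net. */
static const enum step old_order[3] =
    { STEP_SYNC_PIR, STEP_SET_MODE, STEP_IRQ_DISABLE };
/* Ordering in this patch. */
static const enum step patch_order[3] =
    { STEP_IRQ_DISABLE, STEP_SET_MODE, STEP_SYNC_PIR };
/* Ordering with only sync_pir_to_irr moved. */
static const enum step sync_only_order[3] =
    { STEP_SET_MODE, STEP_IRQ_DISABLE, STEP_SYNC_PIR };
```

In this model count_lost(old_order) is 2, matching the two races in the
commit message, while patch_order and sync_only_order both give 0. That
only reflects the assumptions above and does not by itself show that the
revert is safe in the real code.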

Regards,
Wanpeng Li

>
> - IPI after vcpu->mode = IN_GUEST_MODE and interrupt disable: the PI is
> delivered successfully.
> - IPI between vcpu->mode = IN_GUEST_MODE and interrupt disable:
> sync_pir_to_irr will catch the PIR and set RVI.
>
> Regards,
> Wanpeng Li
>
>> +       if (kvm_lapic_enabled(vcpu)) {
>> +               /*
>> +                * This handles the case where a posted interrupt was
>> +                * notified with kvm_vcpu_kick.
>> +                */
>> +               if (kvm_x86_ops->sync_pir_to_irr)
>> +                       kvm_x86_ops->sync_pir_to_irr(vcpu);
>> +       }
>>
>>         if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>>             || need_resched() || signal_pending(current)) {
>> --
>> 1.8.3.1
>>


