RE: [PATCH v7 2/3] x86, apicv: add virtual interrupt delivery support

Gleb Natapov wrote on 2012-12-27:
> On Thu, Dec 27, 2012 at 02:24:04AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-21:
>>> On Fri, Dec 21, 2012 at 09:39:20AM -0200, Marcelo Tosatti wrote:
>>>> On Fri, Dec 21, 2012 at 09:51:40AM +0200, Gleb Natapov wrote:
>>>>> On Thu, Dec 20, 2012 at 08:59:11PM -0200, Marcelo Tosatti wrote:
>>>>>> On Mon, Dec 17, 2012 at 01:30:49PM +0800, Yang Zhang wrote:
>>>>>>> From: Yang Zhang <yang.z.zhang@xxxxxxxxx>
>>>>>>> 
>>>>>>> Virtual interrupt delivery spares KVM from injecting vAPIC interrupts
>>>>>>> manually; the hardware takes care of that entirely. This requires
>>>>>>> some special handling in the existing interrupt injection path:
>>>>>>> 
>>>>>>> - For a pending interrupt, instead of injecting it directly, we may
>>>>>>>   need to update architecture-specific indicators before resuming
>>>>>>>   the guest.
>>>>>>> - A pending interrupt that is masked by the ISR should also be
>>>>>>>   considered in the above update, since the hardware will decide
>>>>>>>   when to inject it. The current has_interrupt and get_interrupt
>>>>>>>   only return a valid vector from the injection p.o.v.
>>>>>>> 
>>>>>>> Signed-off-by: Kevin Tian <kevin.tian@xxxxxxxxx>
>>>>>>> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
>>>>>>> ---
>>>>>>>  arch/ia64/kvm/lapic.h           |    6 ++
>>>>>>>  arch/x86/include/asm/kvm_host.h |    6 ++
>>>>>>>  arch/x86/include/asm/vmx.h      |   11 +++
>>>>>>>  arch/x86/kvm/irq.c              |   56 +++++++++++++-
>>>>>>>  arch/x86/kvm/lapic.c            |   65 ++++++++++-------
>>>>>>>  arch/x86/kvm/lapic.h            |   28 ++++++-
>>>>>>>  arch/x86/kvm/svm.c              |   24 ++++++
>>>>>>>  arch/x86/kvm/vmx.c              |  154 ++++++++++++++++++++++++++++++++++++++-
>>>>>>>  arch/x86/kvm/x86.c              |   11 ++-
>>>>>>>  include/linux/kvm_host.h        |    2 +
>>>>>>>  virt/kvm/ioapic.c               |   36 +++++++++
>>>>>>>  virt/kvm/ioapic.h               |    1 +
>>>>>>>  virt/kvm/irq_comm.c             |   20 +++++
>>>>>>>  13 files changed, 379 insertions(+), 41 deletions(-)
>>>>>>> diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
>>>>>>> index c5f92a9..cb59eb4 100644
>>>>>>> --- a/arch/ia64/kvm/lapic.h
>>>>>>> +++ b/arch/ia64/kvm/lapic.h
>>>>>>> @@ -27,4 +27,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
>>>>>>>  #define kvm_apic_present(x) (true)
>>>>>>>  #define kvm_lapic_enabled(x) (true)
>>>>>>> +static inline void kvm_update_eoi_exitmap(struct kvm *kvm,
>>>>>>> +					struct kvm_lapic_irq *irq)
>>>>>>> +{
>>>>>>> +	/* IA64 has no APICv support, do nothing here */
>>>>>>> +}
>>>>>>> +
>>>>>>>  #endif
>>>>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>>>>> index c431b33..b63a144 100644
>>>>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>>>>> @@ -697,6 +697,11 @@ struct kvm_x86_ops {
>>>>>>>  	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
>>>>>>>  	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
>>>>>>>  	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
>>>>>>> +	int (*has_virtual_interrupt_delivery)(struct kvm_vcpu *vcpu);
>>>>>>> +	void (*update_apic_irq)(struct kvm_vcpu *vcpu, int max_irr);
>>>>>>> +	void (*update_eoi_exitmap)(struct kvm *kvm, struct kvm_lapic_irq *irq);
>>>>>>> +	void (*reset_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>>>> +	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
>>>>>>>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>>>>>>  	int (*get_tdp_level)(void);
>>>>>>>  	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>>>>>> 
>>>>>> EOI exit bitmap is problematic (it's racy). Please do this:
>>>>>> 
>>>>>> 1. Make a synchronous (1) KVM_REQ_EOIBITMAP request on IOAPIC
>>>>>>    register modifications which require EOI exit bitmap updates.
>>>>>> 2. On VM-entry, during KVM_REQ_EOIBITMAP processing, each VCPU
>>>>>>    checks the IOAPIC map and adjusts its own EOI exit bitmap VMCS
>>>>>>    registers.
>>>>>> 
>>>>>> (1) One that waits until remote executing VCPUs have acknowledged
>>>>>> the request, using make_all_cpus_request (see virt/kvm/kvm_main.c),
>>>>>> similarly to remote TLB flushes.
>>>>>> 
>>>>>> What is the problem now: there is no control over _when_ a VCPU
>>>>>> updates its EOI exit bitmap VMCS register from the (remotely updated)
>>>>>> master EOI exit bitmap. The VCPU can be processing a
>>>>>> KVM_REQ_EOIBITMAP relative to a preceding IOAPIC register write
>>>>>> while the current IOAPIC register write is updating the EOI exit
>>>>>> bitmap. There is no way to fix that without locking (which can be
>>>>>> avoided if the IOAPIC->EOI exit bitmap synchronization is vcpu local).
>>>>>> 
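
For illustration only, the vcpu-local update suggested above could look roughly like the sketch below. This is not code from the patch: every struct and function name is hypothetical, the IOAPIC entry is heavily simplified, and it assumes that only level-triggered entries targeting this vcpu need an EOI exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EOI_NR_VECTORS 256
#define EOI_NR_PINS    24
#define EOI_MAX_VCPUS  64

/* Hypothetical, simplified IOAPIC redirection table entry. */
struct eoi_rte {
	uint8_t  vector;
	bool     level_triggered;
	uint64_t dest_mask;	/* bit N set: vcpu N is a destination */
};

struct eoi_ioapic {
	struct eoi_rte rte[EOI_NR_PINS];
};

struct eoi_vcpu {
	int      id;
	/* Private copy; stands in for the per-vcpu VMCS fields. */
	uint64_t eoi_exit_bitmap[EOI_NR_VECTORS / 64];
};

/*
 * Rebuild this vcpu's EOI exit bitmap from the IOAPIC map.
 * The result is built in a scratch array and copied in one step, so the
 * vcpu never sees a half-cleared bitmap. Meant to run on the vcpu itself
 * while it handles the request on the way into the guest.
 */
static void eoi_recalc_exitmap(struct eoi_vcpu *vcpu,
			       const struct eoi_ioapic *ioapic)
{
	uint64_t fresh[EOI_NR_VECTORS / 64] = { 0 };

	assert(vcpu->id >= 0 && vcpu->id < EOI_MAX_VCPUS);

	for (int pin = 0; pin < EOI_NR_PINS; pin++) {
		const struct eoi_rte *e = &ioapic->rte[pin];

		/* Assumption: edge-triggered vectors need no EOI exit. */
		if (!e->level_triggered)
			continue;
		/* Skip entries that are not routed to this vcpu. */
		if (!(e->dest_mask & (UINT64_C(1) << vcpu->id)))
			continue;
		fresh[e->vector / 64] |= UINT64_C(1) << (e->vector % 64);
	}
	memcpy(vcpu->eoi_exit_bitmap, fresh, sizeof(fresh));
}
```

Each vcpu walks the whole table and keeps only its own entries, which is the repeated per-vcpu work discussed further down the thread.
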
>>>>> The race is benign. We have similar one for interrupt injection and
>>>>> the same race exists on a real HW. The two cases that can happen due
>>>>> to the race are:
>>>>> 
>>>>> 1. exitbitmap bit X is changed from 1 to 0
>>>>>   No problem. It is harmless to do an exit, on the next entry
>>>>>   exitbitmap will be fixed.
>>>>> 2. exitbitmap bit X is changed from 0 to 1
>>>>>   If vcpu serves X at the time this happens it was delivered as edge,
>>>>>   so no need to exit. The exitbitmap will be updated after the next
>>>>>   vmexit which will happen due to KVM_REQ_EOIBITMAP processing.
>>>> 
>>>> 1. Missed the case where bitmap is being reset (where EOI exit bitmaps
>>>> are zeroed). In this case vcpu enters guest with incorrect EOI exit
>>>> bitmap.
>>>> 
>>> Right, the bitmap reset is problematic indeed since it does not
>>> represent real vector configuration change.
>>> 
>>>> 2. Missed the case where current code allows vcpu to enter guest
>>>> with EOI exit bitmap unsynchronized relative to IOAPIC registers
>>>> (see one KVM_REQ made at a time, no IPI sent). In that case interrupt
>>>> can be delivered.
>>>> 
>>> Ugh, I was sure there is a kick there. Missing kick is just a bug of
>>> course.
>> Do you mean adding a kick when making the KVM_REQ_EOIBITMAP request?
>> 
> Of course, otherwise vcpu is running with stale bitmap.
Right.
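
As a rough sketch of what "request plus kick" means here (a user-space model, not the KVM code; all names are hypothetical and the kick counter merely stands in for the IPI that forces a vmexit):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REQ_SKETCH_EOIBITMAP (1u << 0)

/* Hypothetical per-vcpu state for this model. */
struct req_vcpu {
	atomic_uint requests;	/* pending request bits */
	atomic_bool in_guest;	/* true while the vcpu runs guest code */
	atomic_uint kicks;	/* models IPIs sent to force an exit */
};

/*
 * Post the request on every vcpu and kick those currently in guest mode.
 * Without the kick, a running vcpu would keep using its stale bitmap
 * until it happened to exit for some unrelated reason.
 */
static void req_make_eoibitmap_request(struct req_vcpu *vcpus, int n)
{
	assert(n >= 0);

	for (int i = 0; i < n; i++) {
		atomic_fetch_or(&vcpus[i].requests, REQ_SKETCH_EOIBITMAP);
		if (atomic_load(&vcpus[i].in_guest))
			atomic_fetch_add(&vcpus[i].kicks, 1);
	}
}

/*
 * Called by a vcpu before entering the guest: test and clear the request.
 * Returns true if the bitmap must be reloaded before entry.
 */
static bool req_check_eoibitmap_request(struct req_vcpu *vcpu)
{
	unsigned int old = atomic_fetch_and(&vcpu->requests,
					    ~REQ_SKETCH_EOIBITMAP);

	return (old & REQ_SKETCH_EOIBITMAP) != 0;
}
```

The request bit is set before the in_guest check so that a vcpu which is just about to enter the guest still sees the request on its own pre-entry check.
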
 
>>>> Thus the suggestions to update bitmap locally, on entry. Do you
>>>> see any disadvantage?
>>>> 
>>> Only one. The recalculation logic is such that given a vector it
>>> calculates set of vcpus, so each vcpu will do this calculation for each
>>> vector and see if it is in the set instead of recalculating once. But
>>> this should be rare enough for us to not care.
>>> 
>>>> Other than that, there is a window between IOAPIC map update and
>>>> EOI bitmap request, where an interrupt can be delivered without
>>>> EOI bitmap being updated (which I think local updates don't cover,
>>>> either).
>>> Interrupt cannot be delivered through IOAPIC while bitmap is updated
>>> since IOAPIC has a lock.
>>> 
>>>> 
>>>>> But software really should take care of not changing interrupt vector
>>>>> configuration while there is an interrupt in flight with the same vector.
>>>> 
>>>> None of these are guest faults. As soon as interrupts are allowed they
>>>> must be handled properly (including synchronized EOI bitmap etc).
>>>> 
>>> All my cases are guest faults and the guest will get in trouble on real
>>> HW too with such behaviour. The case where we clear bitmap before
>>> recalculation is not a guest fault though and has to be dealt with
>>> somehow either with locks or your suggestion.
>> How about setting all bits in the EOI bitmap before recalculation? As
>> you said, it's harmless to do a vmexit, and the EOI bitmap will be
>> updated on the next vmentry.
>> 
> If you set all bits before recalculation they will remain all set after
> recalculation too. I do not get what you propose here.
Yes, I forgot that.
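
To spell the point out, here is a minimal sketch. It assumes the recalculation is purely additive (it only ORs in the vectors that need an EOI exit and never clears a bit), which is how I read the comment above; the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define EXITMAP_WORDS 4	/* 256 vectors / 64 bits per word */

/* Additive recalculation: sets bits for the given vectors, clears none. */
static void exitmap_or_in(uint64_t bitmap[EXITMAP_WORDS],
			  const uint8_t *vectors, int count)
{
	assert(count >= 0);

	for (int i = 0; i < count; i++)
		bitmap[vectors[i] / 64] |= UINT64_C(1) << (vectors[i] % 64);
}

/* Reset to all zeroes, then recalculate: ends with exactly the needed bits. */
static void exitmap_recalc_after_clear(uint64_t bitmap[EXITMAP_WORDS],
				       const uint8_t *vectors, int count)
{
	for (int w = 0; w < EXITMAP_WORDS; w++)
		bitmap[w] = 0;
	exitmap_or_in(bitmap, vectors, count);
}

/* Reset to all ones, then recalculate: every bit simply stays set. */
static void exitmap_recalc_after_set_all(uint64_t bitmap[EXITMAP_WORDS],
					 const uint8_t *vectors, int count)
{
	for (int w = 0; w < EXITMAP_WORDS; w++)
		bitmap[w] = UINT64_MAX;
	exitmap_or_in(bitmap, vectors, count);
}
```

With the all-ones variant the bitmap never narrows back down, so every EOI would keep causing a vmexit.
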

Best regards,
Yang

