RE: [PATCH v3 3/4] x86, apicv: add virtual interrupt delivery support

"Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> · Thu, 6 Dec 2012 07:20:28 +0000



Gleb Natapov wrote on 2012-12-06:
> On Thu, Dec 06, 2012 at 07:16:07AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2012-12-06:
>>> On Thu, Dec 06, 2012 at 02:55:16AM +0000, Zhang, Yang Z wrote:
>>>> Gleb Natapov wrote on 2012-12-05:
>>>>> On Wed, Dec 05, 2012 at 01:51:36PM +0000, Zhang, Yang Z wrote:
>>>>>> Gleb Natapov wrote on 2012-12-05:
>>>>>>> On Wed, Dec 05, 2012 at 06:02:59AM +0000, Zhang, Yang Z wrote:
>>>>>>>> Gleb Natapov wrote on 2012-12-05:
>>>>>>>>> On Wed, Dec 05, 2012 at 01:55:17AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>> Gleb Natapov wrote on 2012-12-04:
>>>>>>>>>>> On Tue, Dec 04, 2012 at 06:39:50AM +0000, Zhang, Yang Z wrote:
>>>>>>>>>>>> Gleb Natapov wrote on 2012-12-03:
>>>>>>>>>>>>> On Mon, Dec 03, 2012 at 03:01:03PM +0800, Yang Zhang wrote:
>>>>>>>>>>>>>> Virtual interrupt delivery avoids KVM to inject vAPIC
>>>>>>>>>>>>>> interrupts manually, which is fully taken care of by the
>>>>>>>>>>>>>> hardware. This needs some special awareness into existing
>>>>>>>>>>>>>> interrupr injection path:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - for pending interrupt, instead of direct injection, we may
> need
>>>>>>>>>>>>>>   update architecture specific indicators before resuming to
>>>>>>>>>>>>>>   guest. - A pending interrupt, which is masked by ISR, should
>>>>>>>>>>>>>>   be also considered in above update action, since hardware
>>>>>>>>>>>>>>   will decide when to inject it at right time. Current
>>>>>>>>>>>>>>   has_interrupt and get_interrupt only returns a valid vector
>>>>>>>>>>>>>>   from injection p.o.v.
>>>>>>>>>>>>> Most of my previous comments still apply.
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> +void kvm_set_eoi_exitmap(struct kvm_vcpu *vcpu, int
>>>>>>>>>>>>>> vector, +		int trig_mode, int always_set) +{ +	if
>>>>>>>>>>>>>> (kvm_x86_ops->set_eoi_exitmap)
>>>>>>>>>>>>>> +		kvm_x86_ops->set_eoi_exitmap(vcpu, vector,
>>>>>>>>>>>>>> +					trig_mode, always_set); +} +
>>>>>>>>>>>>>>  /*
>>>>>>>>>>>>>>   * Add a pending IRQ into lapic.
>>>>>>>>>>>>>>   * Return 1 if successfully added and 0 if discarded.
>>>>>>>>>>>>>> @@ -661,6 +669,7 @@ static int __apic_accept_irq(struct
>>> kvm_lapic
>>>>>>>>> *apic,
>>>>>>>>>>> int
>>>>>>>>>>>>> delivery_mode,
>>>>>>>>>>>>>>  		if (unlikely(!apic_enabled(apic)))
>>>>>>>>>>>>>>  			break;
>>>>>>>>>>>>>> +		kvm_set_eoi_exitmap(vcpu, vector, trig_mode, 0);
>>>>>>>>>>>>> As I said in the last review rebuild the bitmap when ioapic
>>>>>>>>>>>>> or irq notifier configuration changes, user request bit to
>>>>>>>>>>>>> notify vcpus to reload the bitmap.
>>>>>>>>>>>> It is too complicated. When program ioapic entry, we cannot get
> the
>>>>>>> target
>>>>>>>>> vcpu
>>>>>>>>>>> easily. We need to read destination format register and
>>>>>>>>>>> logical destination register to find out target vcpu if using
>>>>>>>>>>> logical mode. Also, we must trap every modification to the two
>>>>>>>>>>> registers to update eoi bitmap. No need to check target vcpu.
>>>>>>>>>>> Enable exit on all vcpus for the vector
>>>>>>>>>> This is wrong. As we known, modern OS uses per VCPU vector. We
>>> cannot
>>>>>>>>> ensure all vectors have same trigger mode. And what's worse, the
>>>>>>>>> vector in another vcpu is used to handle high frequency
>>>>>>>>> interrupts(like 10G NIC), then it will hurt performance.
>>>>>>>>>> 
>>>>>>>>> I never saw OSes reuse vector used by ioapic, as far as I see this
>>>>>>>> Could you point out which code does this check in Linux kernel? I don't
>>>>>>>> see any special checks when Linux kernel allocates a vector.
>>>>>>>> 
>>>>>>> arch/x86/kernel/apic/io_apic.c:create_irq_nr(). It uses
>>>>>>> apic->target_cpus() to get cpu mask. target_cpus() return mask of all
>>>>>>> online cpus. Actually you wrote arch_pi_create_irq() in PI patches to
>>>>>>> workaround this behaviour and allocated vector per cpu, no?
>>>>>> Yes, when create an new irq, it will allocate vector from all online cpus.
> But
>>> after
>>>>> user changes the irq affinity, then the vector will reallocate with
>>>>> new cpumask. And this will leave the vector available on other cpus.
>>>>>> 
>>>>> Since during vector allocation all cpus are checked vector will not be
>>>>> reused if it is allocated on any cpu.
>>>> Sorry, I still cannot find this check in kernel. Can you point me out to it?
>>> You pointed to it above by yourself:
>>>   "Yes, when create an new irq, it will allocate vector from all online
>>> cpus"
>>> So if vector is allocated by at least one online cpu it cannot be reused
>>> during allocation.
>>> 
>>>> Also, I do see the vector reused by MSI and IOAPIC in my system.
>>>> 
>>> What is your system? What is your qemu command line? We only care if MSI
>>> uses the same vector as IOAPICs level interrupt but on different cpu. If
>>> this happens we can use apic_map to calculate per cpu eoi exit bitmap.
>>> 
>>>>>>> Are you aware of any guest that I can run, examine ioapic/apic
>>>>>>> configuration and see that the same vector is used on different vcpus
>>>>>>> for different devices? Can you point me to it?
>>>>>>> 
>>>>> Can you answer this?
>>>> I am sure Xen will reused the IOAPIC vector.
>>>> 
>>> What configuration it creates exactly? Also if Xen running as KVM guest
>>> will takes small performance hit it is not a big problem. Remember there
>>> is not correctness issue here.
>> Anyway, it's doesn't matter to discuss which OS will reuse the vector.
>> So use apic_map is the final decision?
>> 
> If you are firmly committed to per vcpu bitmap then yes. I tried to make
> it simpler for you :)
Ok. Got it. Thanks.:)

Best regards,
Yang


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html