On Mon, Apr 08, 2013 at 11:21:34AM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-04-07:
> > On Sun, Apr 07, 2013 at 01:16:51PM +0000, Zhang, Yang Z wrote:
> >> Gleb Natapov wrote on 2013-04-07:
> >>> On Sun, Apr 07, 2013 at 01:05:02PM +0000, Zhang, Yang Z wrote:
> >>>> Gleb Natapov wrote on 2013-04-07:
> >>>>> On Sun, Apr 07, 2013 at 12:39:32PM +0000, Zhang, Yang Z wrote:
> >>>>>> Gleb Natapov wrote on 2013-04-07:
> >>>>>>> On Sun, Apr 07, 2013 at 02:30:15AM +0000, Zhang, Yang Z wrote:
> >>>>>>>> Gleb Natapov wrote on 2013-04-04:
> >>>>>>>>> On Mon, Apr 01, 2013 at 08:40:13AM +0800, Yang Zhang wrote:
> >>>>>>>>>> From: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>>  arch/x86/kvm/lapic.c |    9 +++++++++
> >>>>>>>>>>  arch/x86/kvm/lapic.h |    2 ++
> >>>>>>>>>>  virt/kvm/ioapic.c    |   43 +++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>>>  virt/kvm/ioapic.h    |    1 +
> >>>>>>>>>>  4 files changed, 55 insertions(+), 0 deletions(-)
> >>>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >>>>>>>>>> index 96ab160..9c041fa 100644
> >>>>>>>>>> --- a/arch/x86/kvm/lapic.c
> >>>>>>>>>> +++ b/arch/x86/kvm/lapic.c
> >>>>>>>>>> @@ -94,6 +94,14 @@ static inline int apic_test_vector(int vec, void *bitmap)
> >>>>>>>>>>  	return test_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> >>>>>>>>>>  }
> >>>>>>>>>>
> >>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector)
> >>>>>>>>>> +{
> >>>>>>>>>> +	struct kvm_lapic *apic = vcpu->arch.apic;
> >>>>>>>>>> +
> >>>>>>>>>> +	return apic_test_vector(vector, apic->regs + APIC_ISR) ||
> >>>>>>>>>> +		apic_test_vector(vector, apic->regs + APIC_IRR);
> >>>>>>>>>> +}
> >>>>>>>>>> +
> >>>>>>>>>>  static inline void apic_set_vector(int vec, void *bitmap)
> >>>>>>>>>>  {
> >>>>>>>>>>  	set_bit(VEC_POS(vec), (bitmap) + REG_POS(vec));
> >>>>>>>>>> @@ -1665,6 +1673,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
> >>>>>>>>>>  	apic->highest_isr_cache = -1;
> >>>>>>>>>>  	kvm_x86_ops->hwapic_isr_update(vcpu->kvm, apic_find_highest_isr(apic));
> >>>>>>>>>>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
> >>>>>>>>>> +	kvm_rtc_irq_restore(vcpu);
> >>>>>>>>>>  }
> >>>>>>>>>>
> >>>>>>>>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu)
> >>>>>>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> >>>>>>>>>> index 967519c..004d2ad 100644
> >>>>>>>>>> --- a/arch/x86/kvm/lapic.h
> >>>>>>>>>> +++ b/arch/x86/kvm/lapic.h
> >>>>>>>>>> @@ -170,4 +170,6 @@ static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
> >>>>>>>>>>  	return vcpu->arch.apic->pending_events;
> >>>>>>>>>>  }
> >>>>>>>>>>
> >>>>>>>>>> +bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
> >>>>>>>>>> +
> >>>>>>>>>>  #endif
> >>>>>>>>>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> >>>>>>>>>> index 8664812..0b12b17 100644
> >>>>>>>>>> --- a/virt/kvm/ioapic.c
> >>>>>>>>>> +++ b/virt/kvm/ioapic.c
> >>>>>>>>>> @@ -90,6 +90,47 @@ static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
> >>>>>>>>>>  	return result;
> >>>>>>>>>>  }
> >>>>>>>>>>
> >>>>>>>>>> +static void rtc_irq_reset(struct kvm_ioapic *ioapic)
> >>>>>>>>>> +{
> >>>>>>>>>> +	ioapic->rtc_status.pending_eoi = 0;
> >>>>>>>>>> +	bitmap_zero(ioapic->rtc_status.dest_map, KVM_MAX_VCPUS);
> >>>>>>>>>> +}
> >>>>>>>>>> +
> >>>>>>>>>> +static void rtc_irq_restore(struct kvm_ioapic *ioapic)
> >>>>>>>>>> +{
> >>>>>>>>>> +	struct kvm_vcpu *vcpu;
> >>>>>>>>>> +	int vector, i, pending_eoi = 0;
> >>>>>>>>>> +
> >>>>>>>>>> +	if (RTC_GSI >= IOAPIC_NUM_PINS)
> >>>>>>>>>> +		return;
> >>>>>>>>>> +
> >>>>>>>>>> +	vector = ioapic->redirtbl[RTC_GSI].fields.vector;
> >>>>>>>>>> +	kvm_for_each_vcpu(i, vcpu, ioapic->kvm) {
> >>>>>>>>>> +		if (kvm_apic_pending_eoi(vcpu, vector)) {
> >>>>>>>>>> +			pending_eoi++;
> >>>>>>>>>> +			__set_bit(vcpu->vcpu_id, ioapic->rtc_status.dest_map);
> >>>>>>>>> You should clear dest_map at the beginning to get rid of stale bits.
> >>>>>>>> I thought kvm_set_ioapic is called only after save/restore or migration,
> >>>>>>>> and the ioapic should be reset successfully before calling it. So the
> >>>>>>>> dest_map is empty before calling rtc_irq_restore().
> >>>>>>>> But it is possible kvm_set_ioapic is called besides save/restore or
> >>>>>>>> migration. Right?
> >>>>>>>>
> >>>>>>> First of all, userspace should not care when it calls kvm_set_ioapic();
> >>>>>>> the kernel needs to do the right thing. Second, believe it or not,
> >>>>>>> kvm_ioapic_reset() is not called during system reset. Instead userspace
> >>>>>>> resets it by calling kvm_set_ioapic() with the ioapic state after reset.
> >>>>>> Ok. I see. With the logic you suggested, it will clear dest_map if there is
> >>>>>> no pending eoi in the vcpu, so we don't need to do it again.
> >>>>>>
> >>>>> You again rely on userspace doing things in a certain manner. What if
> >>>>> set_lapic() is never called? Kernel internal state has to be correct
> >>>>> after each ioctl call.
> >>>> Sorry. I cannot figure out what the problem is if we don't clear dest_map.
> >>>> Can you elaborate on it?
> >>>>
> >>> What is not obvious about it? If there is a bit in dest_map that should
> >>> be cleared after rtc_irq_restore(), it will not be.
> >> Why will it not? If new_val is false and old_val is true,
> >> __clear_bit() will clear the bit in dest_map. Am I wrong?
> >>
> > Ah, yes, with the new logic, since we go over all vcpus and calculate a new
> > value for each one, in theory it should be fine, but if we add cpu
> > destruction this will no longer be true.
> >
> >> new_val = kvm_apic_pending_eoi(vcpu, vector);
> > Which reminds me there are more bugs in the current code. It is not
> > enough to call kvm_apic_pending_eoi() to check the new value. You need to
> > see if the entry is masked and the vcpu is the destination of the interrupt too.
> No. kvm_apic_pending_eoi() is the right way. The IOAPIC entry may change after the vcpu received the interrupt and before it issues EOI, so we should not rely on the entry.
> For example:
> vcpu A received the interrupt, pending in IRR
> mask entry
> migration happened
>
> The only problem is we may account the interrupt from non-IOAPIC(from IPI) into RTC interrupt. But it's ok, we will clear pending_eoi in EOI regardless of source.
>
With apicv this EOI may never come. We have to at least check that the vcpu
is a destination for the interrupt.

--
			Gleb.
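
To make the two review points concrete (clear stale dest_map bits first, and count only vcpus that are a destination of the RTC interrupt), here is a minimal user-space sketch. It is not the kernel code and not the patch that was eventually merged. Every model_* name is hypothetical, dest_map is simplified to a single 32-bit word, and the is_rtc_dest flag stands in for matching the vcpu against the redirection-table entry, which the thread does not spell out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a vcpu's local APIC state (not kernel types). */
struct model_vcpu {
	uint32_t isr[8];	/* 256-bit in-service bitmap, one bit per vector */
	uint32_t irr[8];	/* 256-bit interrupt-request bitmap */
	bool is_rtc_dest;	/* assumption: vcpu matches the RTC entry's destination */
};

/* Hypothetical stand-in for ioapic->rtc_status. */
struct model_rtc_status {
	int pending_eoi;	/* number of vcpus that still owe an EOI */
	uint32_t dest_map;	/* one bit per vcpu index (at most 32 here) */
};

static bool model_test_vector(const uint32_t *bitmap, int vec)
{
	return (bitmap[vec / 32] >> (vec % 32)) & 1u;
}

/* Same idea as kvm_apic_pending_eoi() in the patch: vector set in ISR or IRR. */
static bool model_pending_eoi(const struct model_vcpu *v, int vector)
{
	return model_test_vector(v->isr, vector) ||
		model_test_vector(v->irr, vector);
}

/*
 * Rebuild the RTC status from scratch. Zeroing both fields first means the
 * result does not depend on what userspace did before this call, and the
 * destination check keeps an IPI that happens to use the same vector from
 * being counted as an RTC interrupt.
 */
static void model_rtc_irq_restore(struct model_rtc_status *st,
				  const struct model_vcpu *vcpus, int nr_vcpus,
				  int vector)
{
	int i;

	st->pending_eoi = 0;
	st->dest_map = 0;

	for (i = 0; i < nr_vcpus && i < 32; i++) {
		if (!vcpus[i].is_rtc_dest)
			continue;
		if (model_pending_eoi(&vcpus[i], vector)) {
			st->dest_map |= 1u << i;
			st->pending_eoi++;
		}
	}
}
```

Starting from zero on every call, rather than setting and clearing bits per vcpu, also covers the case raised above where a vcpu that once had its bit set is no longer walked by the loop.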