On Wed, Jul 08, 2009 at 10:17:21AM -0300, Marcelo Tosatti wrote:
> On Wed, Jul 08, 2009 at 03:58:19PM +0300, Gleb Natapov wrote:
> > Excellent patch series.
> >
> > On Sun, Jul 05, 2009 at 10:55:15PM -0300, Marcelo Tosatti wrote:
> > >  int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
> > >  {
> > > -	int ret;
> > > +	ktime_t now, expires;
> > >
> > > -	ret = pit_has_pending_timer(vcpu);
> > > -	ret |= apic_has_pending_timer(vcpu);
> > > +	expires = kvm_vcpu_next_timer_event(vcpu);
> > > +	now = ktime_get();
> > > +	if (expires.tv64 <= now.tv64) {
> > > +		if (kvm_arch_interrupt_allowed(vcpu))
> > > +			set_bit(KVM_REQ_UNHALT, &vcpu->requests);
> > You shouldn't unhalt the vcpu here. Not every timer event will
> > generate an interrupt (the vector can be masked in the pic/ioapic)
>
> Yeah. Note however that kvm_vcpu_next_timer_event only returns the
> expiration time for events that have been acked (for timers that have
> had events injected, but not acked, it returns KTIME_MAX).
>
> So, the code above will set one spurious unhalt if:
>
> - inject timer irq
> - guest acks irq
> - guest masks irq
> - unhalt (once)
>
> I had a "kvm_timer_mask" callback before (along with the attached
> patch), but decided to keep it simpler using the ack trick above.
>
> I suppose one spurious unhalt is harmless, or is it a correctness issue?
>
This is a correctness issue. We should be as close to a real CPU as
possible. This will save us many hours of debugging later :)

> > or a timer event can generate an NMI instead of an interrupt.
>
> In the NMI case it should not unhalt the processor?
Why? It should. It has to jump to the NMI handler.

> (but yes, bypassing the irq injection system is not a very beautiful
> shortcut, but it's done in other places too, e.g. the i8254.c NMI
> injection via all cpus' LINT0).
>
> > Leaving this code out probably means
> > that you can't remove the kvm_inject_pending_timer_irqs() call from
> > __vcpu_run().
> >
> > > +		return 1;
> > > +	}
> > >
> > > -	return ret;
> > > +	return 0;
> > >  }
> > >  EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
> >
> > --
> > 			Gleb.
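By the way, with a real mask check the expiry path would not need the
spurious unhalt at all. A minimal sketch of what I mean (just a sketch:
kvm_timer_int_is_masked() is a made-up helper standing in for whatever
mask tracking we end up with):

int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{
	ktime_t now, expires;

	expires = kvm_vcpu_next_timer_event(vcpu);
	now = ktime_get();
	if (expires.tv64 <= now.tv64) {
		/* Unhalt only if the event can actually reach the vcpu. */
		if (kvm_arch_interrupt_allowed(vcpu) &&
		    !kvm_timer_int_is_masked(vcpu))
			set_bit(KVM_REQ_UNHALT, &vcpu->requests);
		return 1;
	}

	return 0;
}

That way a timer event whose vector is masked still reports as pending,
but never wakes a halted vcpu the way a real CPU wouldn't wake.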
> Index: kvm/arch/x86/kvm/i8259.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/i8259.c
> +++ kvm/arch/x86/kvm/i8259.c
> @@ -122,6 +122,18 @@ static inline int get_priority(struct kv
>  	return priority;
>  }
>
> +int pic_int_is_masked(struct kvm *kvm, int irq)
> +{
> +	u8 imr;
> +	struct kvm_pic *s = pic_irqchip(kvm);
> +
> +	pic_lock(s);
> +	imr = s->pics[SELECT_PIC(irq)].imr;
> +	pic_unlock(s);
> +
> +	return (imr & (1 << irq));
> +}
> +
>  /*
>   * return the pic wanted interrupt. return -1 if none
>   */
> Index: kvm/arch/x86/kvm/irq.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/irq.c
> +++ kvm/arch/x86/kvm/irq.c
> @@ -42,6 +42,29 @@ int kvm_cpu_has_pending_timer(struct kvm
>  EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
>
>  /*
> + * The PIT interrupt can be masked on different levels: PIC/IOAPIC/LAPIC.
> + * Hide the maze from the PIT mask notifier, all it cares about is whether
> + * the interrupt can reach the processor.
> + */
> +void kvm_pit_mask_notifier(struct kvm_irq_mask_notifier *kimn, bool mask)
> +{
> +	bool masked = 0;
> +	struct kvm *kvm;
> +	struct kvm_pit *pit = container_of(kimn, struct kvm_pit, mask_notifier);
> +
> +	kvm = pit->kvm;
> +
> +	/* LAPIC hardware disabled or INTIN0 programmed as ExtINT */
> +	if (kvm_apic_accept_pic_intr(kvm->vcpus[0])) {
> +		masked = pic_int_is_masked(kvm, 0);
> +	/* IOAPIC -> LAPIC */
> +	} else
> +		masked = ioapic_int_is_masked(kvm, 0);
> +
> +	kvm_pit_internal_mask_notifier(kvm, mask);
> +}
> +
> +/*
>   * check if there is pending interrupt without
>   * intack.
>   */
> Index: kvm/virt/kvm/ioapic.c
> ===================================================================
> --- kvm.orig/virt/kvm/ioapic.c
> +++ kvm/virt/kvm/ioapic.c
> @@ -101,6 +101,16 @@ static int ioapic_service(struct kvm_ioa
>  	return injected;
>  }
>
> +int ioapic_int_is_masked(struct kvm *kvm, int pin)
> +{
> +	struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> +
> +	if (!ioapic)
> +		return 1;
> +
> +	return ioapic->redirtbl[pin].fields.mask;
> +}
> +
>  static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
>  {
>  	unsigned index;
> Index: kvm/arch/x86/kvm/i8254.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/i8254.c
> +++ kvm/arch/x86/kvm/i8254.c
> @@ -558,9 +558,9 @@ void kvm_pit_reset(struct kvm_pit *pit)
>  	pit->pit_state.irq_ack = 1;
>  }
>
> -static void pit_mask_notifer(struct kvm_irq_mask_notifier *kimn, bool mask)
> +void kvm_pit_internal_mask_notifier(struct kvm *kvm, bool mask)
>  {
> -	struct kvm_pit *pit = container_of(kimn, struct kvm_pit, mask_notifier);
> +	struct kvm_pit *pit = kvm->arch.vpit;
>
>  	if (!mask) {
>  		atomic_set(&pit->pit_state.pit_timer.pending, 0);
> @@ -614,8 +614,8 @@ struct kvm_pit *kvm_create_pit(struct kv
>
>  	kvm_pit_reset(pit);
>
> -	pit->mask_notifier.func = pit_mask_notifer;
> -	kvm_register_irq_mask_notifier(kvm, 0, &pit->mask_notifier);
> +	pit->mask_notifier.func = kvm_pit_mask_notifier;
> +	kvm_register_irq_mask_notifier(kvm, 0, &kvm->arch.vpit->mask_notifier);
>
>  	kvm_iodevice_init(&pit->dev, &pit_dev_ops);
>  	kvm_io_bus_register_dev(&kvm->pio_bus, &pit->dev);
> Index: kvm/arch/x86/kvm/irq.h
> ===================================================================
> --- kvm.orig/arch/x86/kvm/irq.h
> +++ kvm/arch/x86/kvm/irq.h
> @@ -98,7 +98,13 @@ void __kvm_migrate_apic_timer(struct kvm
>  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>
> +void kvm_pit_internal_mask_notifier(struct kvm *kvm, bool mask);
> +void kvm_pit_mask_notifier(struct kvm_irq_mask_notifier *kimn, bool mask);
> +
>  int pit_has_pending_timer(struct kvm_vcpu *vcpu);
>  int apic_has_pending_timer(struct kvm_vcpu *vcpu);
>
> +int pic_int_is_masked(struct kvm *kvm, int irq);
> +int ioapic_int_is_masked(struct kvm *kvm, int irq);
> +
>  #endif

--
			Gleb.
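P.S. Just to check that I read kvm_pit_mask_notifier right, the decision
it encodes reduces to the sketch below. The helper names are from your
patch; pit_irq_reaches_cpu() itself is mine, and the pin-0 wiring is the
same assumption the patch makes:

static bool pit_irq_reaches_cpu(struct kvm *kvm)
{
	/* LAPIC hardware disabled or INTIN0 as ExtINT: the PIC delivers. */
	if (kvm_apic_accept_pic_intr(kvm->vcpus[0]))
		return !pic_int_is_masked(kvm, 0);

	/* Otherwise the PIT output travels IOAPIC -> LAPIC. */
	return !ioapic_int_is_masked(kvm, 0);
}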