On 10/19/2015 03:14 PM, Christoffer Dall wrote: > On Mon, Oct 19, 2015 at 03:07:16PM +0200, Eric Auger wrote: >> Hi Christoffer, >> On 10/17/2015 10:30 PM, Christoffer Dall wrote: >>> We have an interesting issue when the guest disables the timer interrupt >>> on the VGIC, which happens when turning VCPUs off using PSCI, for >>> example. >>> >>> The problem is that because the guest disables the virtual interrupt at >>> the VGIC level, we never inject interrupts to the guest and therefore >>> never mark the interrupt as active on the physical distributor. The >>> host also never takes the timer interrupt (we only use the timer device >>> to trigger a guest exit and everything else is done in software), so the >>> interrupt does not become active through normal means. >>> >>> The result is that we keep entering the guest with a programmed timer >>> that will always fire as soon as we context switch the hardware timer >>> state and run the guest, preventing forward progress for the VCPU. >>> >>> Since the active state on the physical distributor is really part of the >>> timer logic, it is the job of our virtual arch timer driver to manage >>> this state. >>> >>> The timer->map->active boolean field indicates whether we have signalled >>> this interrupt to the vgic and if that interrupt is still pending or >>> active. As long as that is the case, the hardware doesn't have to >>> generate physical interrupts and therefore we mark the interrupt as >>> active on the physical distributor. >>> >>> Cc: Marc Zyngier <marc.zyngier@xxxxxxx> >>> Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> >>> Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx> >>> --- >>> virt/kvm/arm/arch_timer.c | 19 +++++++++++++++++++ >>> virt/kvm/arm/vgic.c | 43 +++++++++++-------------------------------- >>> 2 files changed, 30 insertions(+), 32 deletions(-) >>> >>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c >>> index 48c6e1a..b9d3a32 100644 >>> --- a/virt/kvm/arm/arch_timer.c >>> +++ b/virt/kvm/arm/arch_timer.c >>> @@ -137,6 +137,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu) >>> void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) >>> { >>> struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu; >>> + bool phys_active; >>> + int ret; >>> >>> /* >>> * We're about to run this vcpu again, so there is no need to >>> @@ -151,6 +153,23 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu) >>> */ >>> if (kvm_timer_should_fire(vcpu)) >>> kvm_timer_inject_irq(vcpu); >>> + >>> + /* >>> + * We keep track of whether the edge-triggered interrupt has been >>> + * signalled to the vgic/guest, and if so, we mask the interrupt and >>> + * the physical distributor to prevent the timer from raising a >>> + * physical interrupt whenever we run a guest, preventing forward >>> + * VCPU progress. >> In practice don't you simply mark the IRQ as active at GIC physical >> distributor level, hence preventing the same IRQ from hitting again > > yes, that's what I meant with my comment, I should reword to "...we mark > the interrupt as active on the physical distributor..." > >>> + */ >>> + if (kvm_vgic_get_phys_irq_active(timer->map)) >>> + phys_active = true; >>> + else >>> + phys_active = false; >>> + >>> + ret = irq_set_irqchip_state(timer->map->irq, >>> + IRQCHIP_STATE_ACTIVE, >>> + phys_active); >> >> physical distributor state is set in arch timer flush. It relates to a >> shared device behavior so I find it natural to do it there. >> >> However the map->active is set in arch_timer IRQ injection and unset in >> vgic sync. Why not doing the set in kvm_vgic_inject_mapped_irq? > > Because you have to set it at every entry to the guest if you run > multiple VCPUs/VMs on this CPU or migrate this VCPU to a different CPU. I meant kvm_vgic_set_phys_irq_active(timer->map, true) call in kvm_timer_inject_irq? Couldn' that been done in kvm_vgic_inject_mapped_irq instead. Doesn't this apply to all mapped IRQs? Eric > >> >>> + WARN_ON(ret); >>> } >>> >>> /** >>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c >>> index 596455a..ea21bc2 100644 >>> --- a/virt/kvm/arm/vgic.c >>> +++ b/virt/kvm/arm/vgic.c >>> @@ -1092,6 +1092,15 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu) >>> struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; >>> struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr); >>> >>> + /* >>> + * We must transfer the pending state back to the distributor before >>> + * retiring the LR, otherwise we may loose edge-triggered interrupts. >>> + */ >>> + if (vlr.state & LR_STATE_PENDING) { >>> + vgic_dist_irq_set_pending(vcpu, irq); >>> + vlr.hwirq = 0; >>> + } >> That fix applies to any edge-sensitive IRQ, ie. not especially the >> timer's one? In the positive shouldn't you precise this in the commit >> msg too? >> > > Probably, it could also be a separate patch. I'll rework this. > > Thanks, > -Christoffer > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html