On Wed, Aug 30, 2017 at 10:19:38AM +0200, Auger Eric wrote: > Hi Christoffer, > > On 29/08/2017 11:39, Christoffer Dall wrote: > > Level-triggered mapped IRQs are special because we only observe rising > > edges as input to the VGIC, and we don't set the EOI flag and therefore > > are not told when the level goes down, so that we can re-queue a new > > interrupt when the level goes up. > > > > One way to solve this problem is to side-step the logic of the VGIC and > > special case the validation in the injection path, but it has the > > unfortunate drawback of having to peak into the physical GIC state > > whenever we want to know if the interrupt is pending on the virtual > > distributor. > > > > Instead, we can maintain the current semantics of a level triggered > > interrupt by sort of treating it as an edge-triggered interrupt, > > following from the fact that we only observe an asserting edge. This > > requires us to be a bit careful when populating the LRs and when folding > > the state back in though: > > > > * We lower the line level when populating the LR, so that when > > subsequently observing an asserting edge, the VGIC will do the right > > thing. > > > > * If the guest never acked the interrupt while running (for example if > > it had masked interrupts at the CPU level while running), we have > > to preserve the pending state of the LR and move it back to the > > line_level field of the struct irq when folding LR state. > > > > If the guest never acked the interrupt while running, but changed the > > device state and lowered the line (again with interrupts masked) then > > we need to observe this change in the line_level. > > > > Both of the above situations are solved by sampling the physical line > > and set the line level when folding the LR back. > > > > * Finally, if the guest never acked the interrupt while running and > > sampling the line reveals that the device state has changed and the > > line has been lowered, we must clear the physical active state, since > > we will otherwise never be told when the interrupt becomes asserted > > again. > > > > This has the added benefit of making the timer optimization patches > > (https://lists.cs.columbia.edu/pipermail/kvmarm/2017-July/026343.html) a > > bit simpler, because the timer code doesn't have to clear the active > > state on the sync anymore. It also potentially improves the performance > > of the timer implementation because the GIC knows the state or the LR > > and only needs to clear the > > active state when the pending bit in the LR is still set, where the > > timer has to always clear it when returning from running the guest with > > an injected timer interrupt. > > > > Signed-off-by: Christoffer Dall <cdall@xxxxxxxxxx> > > --- > > virt/kvm/arm/vgic/vgic-v2.c | 29 +++++++++++++++++++++++++++++ > > virt/kvm/arm/vgic/vgic-v3.c | 29 +++++++++++++++++++++++++++++ > > virt/kvm/arm/vgic/vgic.c | 23 +++++++++++++++++++++++ > > virt/kvm/arm/vgic/vgic.h | 7 +++++++ > > 4 files changed, 88 insertions(+) > > > > diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c > > index e4187e5..f7c5cb5 100644 > > --- a/virt/kvm/arm/vgic/vgic-v2.c > > +++ b/virt/kvm/arm/vgic/vgic-v2.c > > @@ -104,6 +104,26 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu) > > irq->pending_latch = false; > > } > > > > + /* > > + * Level-triggered mapped IRQs are special because we only > > + * observe rising edges as input to the VGIC. > > + * > > + * If the guest never acked the interrupt we have to sample > > + * the physical line and set the line level, because the > > + * device state could have changed or we simply need to > > + * process the still pending interrupt later. > > + * > > + * If this causes us to lower the level, we have to also clear > > + * the physical active state, since we will otherwise never be > > + * told when the interrupt becomes asserted again. > > + */ > > + if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT)) { > > + irq->line_level = vgic_irq_line_level(irq); > > + > > + if (!irq->line_level) > > + vgic_irq_clear_phys_active(irq); > > + } > > + > > spin_unlock(&irq->irq_lock); > > vgic_put_irq(vcpu->kvm, irq); > > } > > @@ -161,6 +181,15 @@ void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr) > > val |= GICH_LR_EOI; > > } > > > > + /* > > + * Level-triggered mapped IRQs are special because we only observe > > + * rising edges as input to the VGIC. We therefore lower the line > > + * level here, so that we can take new virtual IRQs. See > > + * vgic_v2_fold_lr_state for more info. > > + */ > > + if (vgic_irq_is_mapped_level(irq) && (val & GICH_LR_PENDING_BIT)) > > + irq->line_level = false; > > + > > /* The GICv2 LR only holds five bits of priority. */ > > val |= (irq->priority >> 3) << GICH_LR_PRIORITY_SHIFT; > > > > diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c > > index 96ea597..e377036 100644 > > --- a/virt/kvm/arm/vgic/vgic-v3.c > > +++ b/virt/kvm/arm/vgic/vgic-v3.c > > @@ -94,6 +94,26 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu) > > irq->pending_latch = false; > > } > > > > + /* > > + * Level-triggered mapped IRQs are special because we only > > + * observe rising edges as input to the VGIC. > > + * > > + * If the guest never acked the interrupt we have to sample > > + * the physical line and set the line level, because the > > + * device state could have changed or we simply need to > > + * process the still pending interrupt later. > > + * > > + * If this causes us to lower the level, we have to also clear > > + * the physical active state, since we will otherwise never be > > + * told when the interrupt becomes asserted again. > > + */ > > + if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT)) { > > + irq->line_level = vgic_irq_line_level(irq); > > I don't see anything related to the GICD_ISPENDING/ICPENDING. vgic_irq_line_level() reads GICD_ISPEND. Not sure what you mean? > In the > last thread, we said we were obliged to set the pending bit on the > physical distributor to avoid the guest deactivating a non active > physical IRQ. Did we? I don't remember this, and this doesn't make sense to me. The only constraint I think we have is that when setting the HW bit in the LR, the interrupt must be active on the physical side, so that a deactivate by the guest is also a deactivate on the physical side. > If we still intend to do so, with that addition, aren't we > likely to deactivate the physical IRQ before the guest? Do you mean if it's pending+active in the LR? That can never happen for a mapped (HW bit set) interrupt. So this particular code sequence means that (val & ICH_LR_STATE) == ICH_LR_PENDING_BIT and we never clear the physical active state while the virtual interrupt is active. There is an interesting point in terms of handling guest accesses to the virtual distributor GICD_ISPENDR, GICD_ICPENDR, GICD_ISACTIVER, and GICD_ICACTIVER where we must also propogate those changes to the physical side. Was this what you meant? Thanks, -Christoffer > > + > > + if (!irq->line_level) > > + vgic_irq_clear_phys_active(irq); > > + } > > + > > spin_unlock(&irq->irq_lock); > > vgic_put_irq(vcpu->kvm, irq); > > } > > @@ -144,6 +164,15 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr) > > } > > > > /* > > + * Level-triggered mapped IRQs are special because we only observe > > + * rising edges as input to the VGIC. We therefore lower the line > > + * level here, so that we can take new virtual IRQs. See > > + * vgic_v3_fold_lr_state for more info. > > + */ > > + if (vgic_irq_is_mapped_level(irq) && (val & ICH_LR_PENDING_BIT)) > > + irq->line_level = false; > > + > > + /* > > * We currently only support Group1 interrupts, which is a > > * known defect. This needs to be addressed at some point. > > */ > > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c > > index 9d557efd..2691a0a 100644 > > --- a/virt/kvm/arm/vgic/vgic.c > > +++ b/virt/kvm/arm/vgic/vgic.c > > @@ -140,6 +140,29 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq) > > kfree(irq); > > } > > > > +/* Get the input level of a mapped IRQ directly from the physical GIC */ > > +bool vgic_irq_line_level(struct vgic_irq *irq) > > +{ > > + bool line_level; > > + > > + BUG_ON(!irq->hw); > > + > > + WARN_ON(irq_get_irqchip_state(irq->host_irq, > > + IRQCHIP_STATE_PENDING, > > + &line_level)); > > + return line_level; > > +} > > + > > +/* Clear the physical active state */ > > +void vgic_irq_clear_phys_active(struct vgic_irq *irq) > > +{ > > + > > + BUG_ON(!irq->hw); > > + WARN_ON(irq_set_irqchip_state(irq->host_irq, > > + IRQCHIP_STATE_ACTIVE, > > + false)); > > +} > > + > > /** > > * kvm_vgic_target_oracle - compute the target vcpu for an irq > > * > > diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h > > index bba7fa2..22f106d 100644 > > --- a/virt/kvm/arm/vgic/vgic.h > > +++ b/virt/kvm/arm/vgic/vgic.h > > @@ -104,6 +104,11 @@ static inline bool irq_is_pending(struct vgic_irq *irq) > > return irq->pending_latch || irq->line_level; > > } > > > > +static inline bool vgic_irq_is_mapped_level(struct vgic_irq *irq) > > +{ > > + return irq->config == VGIC_CONFIG_LEVEL && irq->hw; > > +} > > + > > /* > > * This struct provides an intermediate representation of the fields contained > > * in the GICH_VMCR and ICH_VMCR registers, such that code exporting the GIC > > @@ -140,6 +145,8 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev, > > struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu, > > u32 intid); > > void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq); > > +bool vgic_irq_line_level(struct vgic_irq *irq); > > +void vgic_irq_clear_phys_active(struct vgic_irq *irq); > > bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq); > > void vgic_kick_vcpus(struct kvm *kvm); > > > >