On Sun, 10 Nov 2019 14:29:14 +0000 Marc Zyngier <maz@xxxxxxxxxx> wrote: Hi Marc, > On Fri, 8 Nov 2019 17:49:51 +0000 > Andre Przywara <andre.przywara@xxxxxxx> wrote: > > > Our current VGIC emulation code treats the "EnableGrpX" bits in GICD_CTLR > > as a single global interrupt delivery switch, where in fact the GIC > > architecture asks for this being separate for the two interrupt groups. > > > > To implement this properly, we have to slightly adjust our design, to > > *not* let IRQs from a disabled interrupt group be added to the ap_list. > > > > As a consequence, enabling one group requires us to re-evaluate every > > pending IRQ and potentially add it to its respective ap_list. Similarly > > disabling an interrupt group requires pending IRQs to be removed from > > the ap_list (as long as they have not been activated yet). > > > > Implement a rather simple, yet not terribly efficient algorithm to > > achieve this: For each VCPU we iterate over all IRQs, checking for > > pending ones and adding them to the list. We hold the ap_list_lock > > for this, to make this atomic from a VCPU's point of view. > > > > When an interrupt group gets disabled, we can't directly remove affected > > IRQs from the ap_list, as a running VCPU might have already activated > > them, which wouldn't be immediately visible to the host. > > Instead simply kick all VCPUs, so that they clean their ap_list's > > automatically when running vgic_prune_ap_list(). > > > > Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx> > > --- > > virt/kvm/arm/vgic/vgic.c | 88 ++++++++++++++++++++++++++++++++++++---- > > 1 file changed, 80 insertions(+), 8 deletions(-) > > > > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c > > index 3b88e14d239f..28d9ff282017 100644 > > --- a/virt/kvm/arm/vgic/vgic.c > > +++ b/virt/kvm/arm/vgic/vgic.c > > @@ -339,6 +339,38 @@ int vgic_dist_enable_group(struct kvm *kvm, int group, bool status) > > return 0; > > } > > > > +/* > > + * Check whether a given IRQs need to be queued to this ap_list, and do > > + * so if that's the case. > > + * Requires the ap_list_lock to be held (but not the irq lock). > > + * > > + * Returns 1 if that IRQ has been added to the ap_list, and 0 if not. > > + */ > > +static int queue_enabled_irq(struct kvm *kvm, struct kvm_vcpu *vcpu, > > + int intid) > > true/false seems better than 1/0. Mmh, indeed. I think I had more in there in an earlier version. > > +{ > > + struct vgic_irq *irq = vgic_get_irq(kvm, vcpu, intid); > > + int ret = 0; > > + > > + raw_spin_lock(&irq->irq_lock); > > + if (!irq->vcpu && vcpu == vgic_target_oracle(irq)) { > > + /* > > + * Grab a reference to the irq to reflect the > > + * fact that it is now in the ap_list. > > + */ > > + vgic_get_irq_kref(irq); > > + list_add_tail(&irq->ap_list, > > + &vcpu->arch.vgic_cpu.ap_list_head); > > Two things: > - This should be the job of vgic_queue_irq_unlock. Why are you > open-coding it? I was *really* keen on reusing that, but couldn't for two reasons: a) the locking code inside vgic_queue_irq_unlock spoils that: It requires the irq_lock to be held, but not the ap_list_lock. Then it takes both locks, but returns with both of them dropped. We need to hold the ap_list_lock all of the time, to prevent any VCPU returning to the HV to interfere with this routine. b) vgic_queue_irq_unlock() kicks the VCPU already, where I want to just add all of them first, then kick the VCPU at the end. So I decided to go with the stripped-down version of it, because I didn't dare to touch the original function. I could refactor this "actually add to the list" part of vgic_queue_irq_unlock() into this new function, then call it from both vgic_queue_irq_unlock() and from the new users. > - What if the interrupt isn't pending? Non-pending, non-active > interrupts should not be on the AP list! That should be covered by vgic_target_oracle() already, shouldn't it? > > + irq->vcpu = vcpu; > > + > > + ret = 1; > > + } > > + raw_spin_unlock(&irq->irq_lock); > > + vgic_put_irq(kvm, irq); > > + > > + return ret; > > +} > > + > > /* > > * The group enable status of at least one of the groups has changed. > > * If enabled is true, at least one of the groups got enabled. > > @@ -346,17 +378,57 @@ int vgic_dist_enable_group(struct kvm *kvm, int group, bool status) > > */ > > void vgic_rescan_pending_irqs(struct kvm *kvm, bool enabled) > > { > > + int cpuid; > > + struct kvm_vcpu *vcpu; > > + > > /* > > - * TODO: actually scan *all* IRQs of the VM for pending IRQs. > > - * If a pending IRQ's group is now enabled, add it to its ap_list. > > - * If a pending IRQ's group is now disabled, kick the VCPU to > > - * let it remove this IRQ from its ap_list. We have to let the > > - * VCPU do it itself, because we can't know the exact state of an > > - * IRQ pending on a running VCPU. > > + * If no group got enabled, we only have to potentially remove > > + * interrupts from ap_lists. We can't do this here, because a running > > + * VCPU might have ACKed an IRQ already, which wouldn't immediately > > + * be reflected in the ap_list. > > + * So kick all VCPUs, which will let them re-evaluate their ap_lists > > + * by running vgic_prune_ap_list(), removing no longer enabled > > + * IRQs. > > + */ > > + if (!enabled) { > > + vgic_kick_vcpus(kvm); > > + > > + return; > > + } > > + > > + /* > > + * At least one group went from disabled to enabled. Now we need > > + * to scan *all* IRQs of the VM for newly group-enabled IRQs. > > + * If a pending IRQ's group is now enabled, add it to the ap_list. > > + * > > + * For each VCPU this needs to be atomic, as we need *all* newly > > + * enabled IRQs in be in the ap_list to determine the highest > > + * priority one. > > + * So grab the ap_list_lock, then iterate over all private IRQs and > > + * all SPIs. Once the ap_list is updated, kick that VCPU to > > + * forward any new IRQs to the guest. > > */ > > + kvm_for_each_vcpu(cpuid, vcpu, kvm) { > > + unsigned long flags; > > + int i; > > > > - /* For now just kick all VCPUs, as the old code did. */ > > - vgic_kick_vcpus(kvm); > > + raw_spin_lock_irqsave(&vcpu->arch.vgic_cpu.ap_list_lock, flags); > > + > > + for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) > > + queue_enabled_irq(kvm, vcpu, i); > > + > > + for (i = VGIC_NR_PRIVATE_IRQS; > > + i < kvm->arch.vgic.nr_spis + VGIC_NR_PRIVATE_IRQS; i++) > > + queue_enabled_irq(kvm, vcpu, i); > > On top of my questions above, what happens to LPIs? Oh dear. Looks like wishful thinking on my side ;-) Iterating over all interrupts is probably not a good idea anymore. Do you think this idea of having a list with group-disabled IRQs is a better approach: In vgic_queue_irq_unlock, if a pending IRQ's group is enabled, it goes into the ap_list, if not, it goes into another list instead. Then we would only need to consult this other list when a group gets enabled. Both lists protected by the same ap_list_lock. Does that make sense? > And if a group has > been disabled, how do you retire these interrupts from the AP list? This is done above: we kick the respective VCPU and rely on vgic_prune_ap_list() to remove them (that uses vgic_target_oracle(), which in turn checks vgic_irq_is_grp_enabled()). Cheers, Andre. > > + > > + raw_spin_unlock_irqrestore(&vcpu->arch.vgic_cpu.ap_list_lock, > > + flags); > > + > > + if (kvm_vgic_vcpu_pending_irq(vcpu)) { > > + kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu); > > + kvm_vcpu_kick(vcpu); > > + } > > + } > > } > > > > bool vgic_dist_group_enabled(struct kvm *kvm, int group) > > Thanks, > > M.