I've been trying to run VMs on a GICv3-based system that offers the
GICv2 compatibility feature, and noticed that they would tend to
slowly die under load, or even without load.

It turned out that this is due to KVM not being exactly true to the
architecture: it ends up injecting multiple SGIs with the same vintid,
which the architecture clearly outlines as a "don't do that". This bug
has been there since the first days of the "new vgic". This also
affects GICv2, but for some reason GIC-400 seems quite tolerant, and
GIC-500 much less so.

The fix is a bit tortuous, as we must ensure that we never allow
interrupts of lesser priority to be queued before all the pending
multi-source SGIs are injected (I'd be happy to provide beer to
whoever writes a proper unit test for that one).

Another issue is that we don't use the right barriers when exiting
from the guest, as we only synchronize stores, while the architecture
requires us to synchronize both loads and stores. And we miss an isb
to force execution of the previous dsb.

- From v1:
  - Reworked patch #1 after much discussion with Christoffer.

Marc Zyngier (2):
  KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
  kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3

 include/linux/irqchip/arm-gic-v3.h |  1 +
 include/linux/irqchip/arm-gic.h    |  1 +
 virt/kvm/arm/hyp/vgic-v3-sr.c      |  3 +-
 virt/kvm/arm/vgic/vgic-v2.c        |  9 +++++-
 virt/kvm/arm/vgic/vgic-v3.c        |  9 +++++-
 virt/kvm/arm/vgic/vgic.c           | 61 +++++++++++++++++++++++++++++---------
 virt/kvm/arm/vgic/vgic.h           |  2 ++
 7 files changed, 69 insertions(+), 17 deletions(-)

-- 
2.14.2
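
For readers who want to see the ordering constraint behind patch #1 in
isolation, here is a minimal user-space sketch. It is not the kernel
code: struct model_irq, struct model_lr, sort_by_priority() and
flush_lrs() are hypothetical names made up for illustration, and the
model assumes that each pass sorts the pending interrupts by priority,
queues at most one source of a given SGI, and stops before any
lower-priority interrupt while that SGI still has sources left over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; none of these names come from the kernel. */
struct model_irq {
	uint32_t intid;    /* virtual interrupt ID */
	uint8_t  priority; /* lower value means higher priority */
	uint8_t  sources;  /* SGIs only: bitmap of source CPUs still pending */
	bool     pending;
};

struct model_lr {
	uint32_t vintid;
	uint8_t  source;   /* source CPU for SGIs, 0 otherwise */
};

/* Stable insertion sort, highest priority (lowest value) first. */
static void sort_by_priority(struct model_irq *irqs, int n)
{
	for (int i = 1; i < n; i++) {
		struct model_irq tmp = irqs[i];
		int j = i - 1;

		while (j >= 0 && irqs[j].priority > tmp.priority) {
			irqs[j + 1] = irqs[j];
			j--;
		}
		irqs[j + 1] = tmp;
	}
}

/* Fill at most nr_lr list registers and return how many were used. */
static int flush_lrs(struct model_irq *irqs, int n,
		     struct model_lr *lrs, int nr_lr)
{
	bool multi_sgi = false;
	uint8_t prio = 0xff;
	int count = 0;

	assert(lrs && nr_lr > 0);

	/* An SGI with more than one source bit set is a multi-source SGI. */
	for (int i = 0; i < n; i++)
		if (irqs[i].pending && (irqs[i].sources & (irqs[i].sources - 1)))
			multi_sgi = true;

	/* Assumption of this model: always sort, to keep the sketch short. */
	sort_by_priority(irqs, n);

	for (int i = 0; i < n && count < nr_lr; i++) {
		struct model_irq *irq = &irqs[i];

		if (!irq->pending)
			continue;

		/*
		 * Once an SGI with sources left over has been queued, nothing
		 * of lower priority may be queued in this pass.
		 */
		if (multi_sgi && irq->priority > prio)
			break;

		lrs[count].vintid = irq->intid;
		lrs[count].source = 0;

		if (irq->sources) {
			uint8_t src = 0;

			/* Inject one source only: one LR per vintid per pass. */
			while (!(irq->sources & (1u << src)))
				src++;
			irq->sources &= (uint8_t)~(1u << src);
			lrs[count].source = src;
		}

		/* Only an SGI with remaining sources stays pending. */
		irq->pending = irq->sources != 0;
		if (irq->pending)
			prio = irq->priority;
		count++;
	}

	return count;
}
```

With an SGI pending from two source CPUs plus one interrupt of higher
priority and one of lower priority, the first pass of this model queues
the higher-priority interrupt and one source of the SGI, then stops.
The lower-priority interrupt only gets a list register on the next
pass, alongside the remaining SGI source.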