On Thu, Nov 07, 2024, Chao Gao wrote:
> On Wed, Nov 06, 2024 at 05:54:19AM -0800, Sean Christopherson wrote:
> >On Wed, Nov 06, 2024, Chao Gao wrote:
> >> >Furthermore, in addition to introducing this issue, commit 755c2bf87860 also
> >> >papered over the underlying bug: KVM doesn't ensure CPUs and devices see APICv
> >> >as disabled prior to searching the IRR.  Waiting until KVM emulates EOI to update
> >> >irr_pending works because KVM won't emulate EOI until after refresh_apicv_exec_ctrl(),
> >> >and because there are plenty of memory barries in between, but leaving irr_pending
> >> >set is basically hacking around bad ordering, which I _think_ can be fixed by:
> >> >
> >> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> >index 83fe0a78146f..85d330b56c7e 100644
> >> >--- a/arch/x86/kvm/x86.c
> >> >+++ b/arch/x86/kvm/x86.c
> >> >@@ -10548,8 +10548,8 @@ void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> >> > 		goto out;
> >> >
> >> > 	apic->apicv_active = activate;
> >> >-	kvm_apic_update_apicv(vcpu);
> >> > 	kvm_x86_call(refresh_apicv_exec_ctrl)(vcpu);
> >> >+	kvm_apic_update_apicv(vcpu);
> >>
> >> I may miss something important. how does this change ensure CPUs and devices see
> >> APICv as disabled (thus won't manipulate the vCPU's IRR)? Other CPUs when
> >> performing IPI virtualization just looks up the PID_table while IOMMU looks up
> >> the IRTE table. ->refresh_apicv_exec_ctrl() doesn't change any of them.
> >
> >For Intel, which is a bug (one of many in this area).  AMD does update both.  The
> >failure Maxim was addressing was on AMD (AVIC), which has many more scenarios where
> >it needs to be inhibited/disabled.
>
> Yes indeed. Actually the commit below fixes the bug for Intel already. Just the
> approach isn't to let other CPUs and devices see APICv disabled. Instead, pick
> up all pending IRQs (in PIR) before VM-entry and cancel VM-entry if needed.
>
> commit 7e1901f6c86c896acff6609e0176f93f756d8b2a
> Author: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Date:   Mon Nov 22 19:43:09 2021 -0500
>
>     KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
>
>     If APICv is disabled for this vCPU, assigned devices may still attempt to
>     post interrupts.  In that case, we need to cancel the vmentry and deliver
>     the interrupt with KVM_REQ_EVENT.  Extend the existing code that handles
>     injection of L1 interrupts into L2 to cover this case as well.
>
>     vmx_hwapic_irr_update is only called when APICv is active so it would be
>     confusing to add a check for vcpu->arch.apicv_active in there.  Instead,
>     just use vmx_set_rvi directly in vmx_sync_pir_to_irr.

Ah, right, and that approach works because the posted interrupt notification IRQ
is guaranteed to cause a VM-Exit, and KVM keeps the destination CPU in the PID
up-to-date even if APICv is inhibited.

But on AMD, the GA log interrupt is per-IOMMU and so isn't affined to the CPU on
which the vCPU that generated that log entry is running, i.e. won't force an exit
on the destination.  Oh, and the vCPU's entry in the IPI virtualization table
needs to be marked as not-running so that the sender is forced to exit and kick
the target.

In theory, kicking the target vCPU in avic_ga_log_notifier() would allow keeping
the associated IRTEs in guest/posted mode.  I'm mildly curious if that would yield
better or worse performance/latency than going through the per-IRQ handler.
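
For anyone following along, the Intel-side approach from the quoted commit can be
sketched as a toy, userspace-only C model: drain PIR into IRR before VM-entry and,
if APICv is disabled and something had been posted, cancel the entry and raise an
event request.  Every name here (toy_vcpu, toy_post_interrupt, toy_sync_pir_to_irr,
toy_prepare_vm_entry, req_event) is hypothetical and is not KVM's actual code; the
real path uses atomics and also handles nested guests, both of which this sketch
ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VECTORS 256
#define TOY_WORDS      (TOY_NR_VECTORS / 64)

/* Hypothetical toy model; these are NOT KVM's real structures. */
struct toy_vcpu {
	uint64_t pir[TOY_WORDS];  /* requests posted by devices/other CPUs */
	uint64_t irr[TOY_WORDS];  /* software view of pending interrupts */
	bool apicv_active;        /* false == APICv inhibited/disabled */
	bool req_event;           /* stands in for KVM_REQ_EVENT */
};

/* A device "posts" a vector by setting its bit in the PIR. */
static void toy_post_interrupt(struct toy_vcpu *v, unsigned int vec)
{
	assert(vec < TOY_NR_VECTORS);
	v->pir[vec / 64] |= UINT64_C(1) << (vec % 64);
}

/* Move all PIR bits into IRR; return the highest pending vector, or -1. */
static int toy_sync_pir_to_irr(struct toy_vcpu *v)
{
	int max = -1;

	for (int w = 0; w < TOY_WORDS; w++) {
		v->irr[w] |= v->pir[w];
		v->pir[w] = 0;
		for (int b = 0; b < 64; b++)
			if (v->irr[w] & (UINT64_C(1) << b))
				max = w * 64 + b;
	}
	return max;
}

/*
 * Called just before "VM-entry".  Returns true if entry may proceed, false
 * if it must be cancelled so the interrupt can be injected in software.
 */
static bool toy_prepare_vm_entry(struct toy_vcpu *v, int *max_irr)
{
	bool got_posted = false;

	for (int w = 0; w < TOY_WORDS; w++)
		got_posted |= (v->pir[w] != 0);

	*max_irr = toy_sync_pir_to_irr(v);

	/* With APICv off, hardware won't deliver it; fall back to software. */
	if (!v->apicv_active && got_posted) {
		v->req_event = true;
		return false;
	}
	return true;
}
```

The point of the model is only the ordering: nothing stops a device from posting
while APICv is off, so the pending bits are collected at the last moment before
entry instead of relying on the poster seeing APICv as disabled.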