On Mon, Feb 24, 2025 at 8:55 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common > svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff() > so that VMRUN is not executed in an STI shadow. AMD CPUs have a quirk > (some would say "bug"), where the STI shadow bleeds into the guest's > intr_state field if a #VMEXIT occurs during injection of an event, i.e. if > the VMRUN doesn't complete before the subsequent #VMEXIT. > > The spurious "interrupts masked" state is relatively benign, as it only > occurs during event injection and is transient. Because KVM is already > injecting an event, the guest can't be in HLT, and if KVM is querying IRQ > blocking for injection, then KVM would need to force an immediate exit > anyways since injecting multiple events is impossible. > > However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the > spurious STI shadow is visible to L1 when running a nested VM, which can > trip sanity checks, e.g. in VMware's VMM. > > Hoist the STI+CLI all the way to C code, as the aforementioned calls to > guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are > enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already > possible. I.e. if there's kernel code that is confused by running with > RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also > blocks NMIs, the only change in exposure to non-KVM code (relative to > surrounding VMRUN with STI+CLI) is exception handling code, and except for > the kvm_rebooting=1 case, all exception in the core VM-Enter/VM-Exit path > are fatal. > > Use the "raw" variants to enable/disable IRQs to avoid tracing in the > "no instrumentation" code; the guest state helpers also take care of > tracing IRQ state. > > Oppurtunstically document why KVM needs to do STI in the first place. > > Reported-by: Doug Covelli <doug.covelli@xxxxxxxxxxxx> > Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4uzJQ@xxxxxxxxxxxxxx > Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> Reviewed-by: Jim Mattson <jmattson@xxxxxxxxxx>