[PATCH v5 00/27] KVM: x86: Event/exception fixes and cleanups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Paolo and others, please take a look at the proposed changes to
force_emulation_prefix, especially the last patch which makes it writable
while KVM is running.  I'm pretty confident that's safe, but definitely
want confirmation that I'm not overlooking something.

The main goal of this series is to fix KVM's longstanding bug of not
honoring L1's exception intercepts wants when handling an exception that
occurs during delivery of a different exception.  E.g. if L0 and L1 are
using shadow paging, and L2 hits a #PF, and then hits another #PF while
vectoring the first #PF due to _L1_ not having a shadow page for the IDT,
KVM needs to check L1's intercepts before morphing the #PF => #PF => #DF
so that the #PF is routed to L1, not injected into L2 as a #DF.

nVMX has hacked around the bug for years by overriding the #PF injector
for shadow paging to go straight to VM-Exit, and nSVM has started doing
the same.  The hacks mostly work, but they're incomplete, confusing, and
lead to other hacky code, e.g. bailing from the emulator because #PF
injection forced a VM-Exit and suddenly KVM is back in L1.

v5:
 - Collect reviews. [Maxim]
 - Check code breakpoints on forced emulation (FEP #UD).  Hardware checks
   the RIP of the magic prefix, KVM needs to check the RIP of the insn.
 - Suppress code #DBs on Intel vCPUs if MOV/POP-SS blocking is active.
 - Extend the forced emulation to allow clearing RFLAGS.RF (Intel sets
   it unconditionaly on VM-Exit(#UD), which makes it impossibled to use
   forced emulation to test code #DBs.
 - Allow writing force_emulation_prefix while KVM is running.

v4:
 - https://lore.kernel.org/all/20220723005137.1649592-1-seanjc@xxxxxxxxxx
 - Collect reviews. [Maxim]
 - Fix a bug where an intermediate patch dropped the async #PF token
   and used a stale payload. [Maxim]
 - Tweak comments to call out that AMD CPUs generate error codes with
   bits 31:16 != 0. [Maxim]

v3:
 - https://lore.kernel.org/all/20220715204226.3655170-1-seanjc@xxxxxxxxxx
 - Collect reviews. [Maxim, Jim]
 - Split a few patches into more consumable chunks. [Maxim]
 - Document that KVM doesn't correctly handle SMI+MTF (or SMI priority). [Maxim]
 - Add comment to document the instruction boundary (event window) aspect
   of block_nested_events. [Maxim]
 - Add a patch to rename inject_pending_events() and add a comment to
   document KVM's not-quite-architecturally-correct handing of instruction
   boundaries and asynchronous events. [Maxim]

v2:
  - https://lore.kernel.org/all/20220614204730.3359543-1-seanjc@xxxxxxxxxx
  - Rebased to kvm/queue (commit 8baacf67c76c) + selftests CPUID
    overhaul.
    https://lore.kernel.org/all/20220614200707.3315957-1-seanjc@xxxxxxxxxx
  - Treat KVM_REQ_TRIPLE_FAULT as a pending exception.

v1: https://lore.kernel.org/all/20220614204730.3359543-1-seanjc@xxxxxxxxxx


Sean Christopherson (27):
  KVM: nVMX: Unconditionally purge queued/injected events on nested
    "exit"
  KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS
  KVM: x86: Don't check for code breakpoints when emulating on exception
  KVM: x86: Allow clearing RFLAGS.RF on forced emulation to test code
    #DBs
  KVM: x86: Suppress code #DBs on Intel if MOV/POP SS blocking is active
  KVM: nVMX: Treat General Detect #DB (DR7.GD=1) as fault-like
  KVM: nVMX: Prioritize TSS T-flag #DBs over Monitor Trap Flag
  KVM: x86: Treat #DBs from the emulator as fault-like (code and
    DR7.GD=1)
  KVM: x86: Use DR7_GD macro instead of open coding check in emulator
  KVM: nVMX: Ignore SIPI that arrives in L2 when vCPU is not in WFS
  KVM: nVMX: Unconditionally clear mtf_pending on nested VM-Exit
  KVM: VMX: Inject #PF on ENCLS as "emulated" #PF
  KVM: x86: Rename kvm_x86_ops.queue_exception to inject_exception
  KVM: x86: Make kvm_queued_exception a properly named, visible struct
  KVM: x86: Formalize blocking of nested pending exceptions
  KVM: x86: Use kvm_queue_exception_e() to queue #DF
  KVM: x86: Hoist nested event checks above event injection logic
  KVM: x86: Evaluate ability to inject SMI/NMI/IRQ after potential
    VM-Exit
  KVM: nVMX: Add a helper to identify low-priority #DB traps
  KVM: nVMX: Document priority of all known events on Intel CPUs
  KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
  KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
  KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle
    behavior
  KVM: x86: Rename inject_pending_events() to
    kvm_check_and_inject_events()
  KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
  KVM: selftests: Add an x86-only test to verify nested exception
    queueing
  KVM: x86: Allow force_emulation_prefix to be written without a reload

 arch/x86/include/asm/kvm-x86-ops.h            |   2 +-
 arch/x86/include/asm/kvm_host.h               |  35 +-
 arch/x86/kvm/emulate.c                        |   3 +-
 arch/x86/kvm/svm/nested.c                     | 110 ++--
 arch/x86/kvm/svm/svm.c                        |  20 +-
 arch/x86/kvm/vmx/nested.c                     | 331 ++++++++----
 arch/x86/kvm/vmx/sgx.c                        |   2 +-
 arch/x86/kvm/vmx/vmx.c                        |  54 +-
 arch/x86/kvm/x86.c                            | 488 ++++++++++++------
 arch/x86/kvm/x86.h                            |  11 +-
 tools/testing/selftests/kvm/.gitignore        |   1 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/include/x86_64/svm_util.h   |   7 +-
 .../selftests/kvm/include/x86_64/vmx.h        |  51 +-
 .../kvm/x86_64/nested_exceptions_test.c       | 295 +++++++++++
 15 files changed, 987 insertions(+), 424 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/nested_exceptions_test.c


base-commit: 372d07084593dc7a399bf9bee815711b1fb1bcf2
-- 
2.37.2.672.g94769d06f0-goog




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux