[PATCH 0/5] Introduce a quirk to control memslot zap behavior

Today, KVM does not zap only the memslot's leaf SPTEs when a memslot is
moved or deleted. Instead, it invalidates all page tables and generates
fresh ones based on the new memslot layout (referred to as "zap all" for
short). This "zap all" behavior has low overhead for most use cases, and
was adopted primarily because of a bug that caused VM instability when an
Nvidia GeForce GPU was assigned to the VM (see link in patch 1).

However, the "zap all" behavior is undesirable in certain scenarios, e.g.
- It's not viable for TDX:
  a) TDX requires that the root page of the private page table remain
     unaltered throughout the TD life cycle.
  b) TDX mandates that leaf entries in the private page table be zapped
     prior to non-leaf entries.
  c) TDX requires private pages to be re-accepted after they are dropped.
- It's not performant for scenarios involving frequent deletion and
  re-adding of numerous small memslots.

This series therefore introduces the KVM_X86_QUIRK_SLOT_ZAP_ALL quirk,
enabling users to control the behavior of memslot zapping when a memslot is
moved/deleted.

The quirk is turned on by default, which results in all SPTEs being
invalidated/zapped when a memslot is moved/deleted.

Users have the option to turn off the quirk. Doing so limits the zapping
to only the leaf SPTEs within the range of the memslot being moved/deleted.
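
To make the two behaviors concrete, here is a small user-space model. It is
only a sketch and not KVM code: struct toy_vm, struct toy_spte,
toy_check_has_quirk(), toy_flush_memslot() and the bit value of
TOY_QUIRK_SLOT_ZAP_ALL are all made up for illustration. The one thing it
borrows from KVM is the convention that a quirk is in effect unless its bit
is set in disabled_quirks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value, chosen for this model only. */
#define TOY_QUIRK_SLOT_ZAP_ALL (1ULL << 0)

struct toy_spte {
	uint64_t gfn;     /* guest frame number covered by this entry */
	bool     leaf;    /* true for a leaf SPTE */
	bool     present; /* cleared when the entry is zapped */
};

struct toy_vm {
	uint64_t disabled_quirks; /* a set bit means the quirk is off */
	struct toy_spte *sptes;
	size_t nr_sptes;
};

/* A quirk is in effect unless its bit is set in disabled_quirks. */
static bool toy_check_has_quirk(const struct toy_vm *vm, uint64_t quirk)
{
	return !(vm->disabled_quirks & quirk);
}

/*
 * Model of a memslot move/delete covering gfns
 * [base_gfn, base_gfn + npages). Returns the number of entries zapped.
 */
static size_t toy_flush_memslot(struct toy_vm *vm, uint64_t base_gfn,
				uint64_t npages)
{
	bool zap_all = toy_check_has_quirk(vm, TOY_QUIRK_SLOT_ZAP_ALL);
	size_t i, zapped = 0;

	assert(npages > 0);

	for (i = 0; i < vm->nr_sptes; i++) {
		struct toy_spte *e = &vm->sptes[i];
		bool in_slot = e->gfn >= base_gfn &&
			       e->gfn - base_gfn < npages;

		if (!e->present)
			continue;

		/* Quirk on: zap everything. Quirk off: leaf SPTEs in the slot only. */
		if (zap_all || (e->leaf && in_slot)) {
			e->present = false;
			zapped++;
		}
	}
	return zapped;
}
```

With the quirk bit set in disabled_quirks, entries outside the slot and all
non-leaf entries survive the call. With the bit clear, every present entry
is dropped regardless of the range passed in.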

This series has been tested with
- Normal VMs
  w/ and w/o device assignment, and kvm selftests

- TDX guests.
  Memslot deletion typically does not occur without device assignment for a
  TD. Therefore, it is tested with shared device assignment.

Note: For TDX integration, the quirk is currently disabled by TDX code in
      QEMU rather than being disabled automatically by KVM based on VM
      type. This is not safe: a malfunctioning QEMU that fails to disable
      the quirk could result in the shared EPT being invalidated while the
      private EPT remains unaffected, as kvm_mmu_zap_all_fast() only
      targets the shared EPT.

      However, kvm->arch.disabled_quirks is currently entirely
      user-controlled, and there is no mechanism for users to verify
      whether a quirk has been disabled by the kernel.
      We are therefore wondering which of the options below is better for
      TDX:

      a) Add a condition for the TDX VM type in
         kvm_arch_flush_shadow_memslot(), in addition to the
         kvm_check_has_quirk() test. This is similar to
         "all new VM types have the quirk disabled". e.g.

         static inline bool kvm_memslot_flush_zap_all(struct kvm *kvm)
         {
              return kvm->arch.vm_type != KVM_X86_TDX_VM &&
                     kvm_check_has_quirk(kvm, KVM_X86_QUIRK_SLOT_ZAP_ALL);
         }
         
      b) Initialize disabled_quirks based on VM type in the kernel, and
         extend the disabled_quirks querying/setting interface to enforce
         that the quirk is disabled for TDX.
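
For comparison with the snippet in a), one possible shape of option b) is
sketched below as a user-space model. Everything in it is an assumption
made for illustration: struct quirk_vm, the forced_disabled field, the
quirk_*() helpers and the Q_* / QVM_* constants do not exist in KVM, and
the real interface could look quite different. The idea is that the kernel
records which quirks it has force-disabled for a VM type, keeps those bits
set whatever userspace asks for, and reports the effective mask back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, chosen for this model only. */
#define Q_SLOT_ZAP_ALL (1ULL << 0)
#define Q_VALID        (Q_SLOT_ZAP_ALL | (1ULL << 1))

enum quirk_vm_type { QVM_DEFAULT, QVM_TDX };

struct quirk_vm {
	enum quirk_vm_type type;
	uint64_t disabled_quirks; /* a set bit means the quirk is off */
	uint64_t forced_disabled; /* bits userspace cannot clear */
};

/* VM creation: seed disabled_quirks from the VM type. */
static void quirk_vm_init(struct quirk_vm *vm, enum quirk_vm_type type)
{
	vm->type = type;
	vm->forced_disabled = (type == QVM_TDX) ? Q_SLOT_ZAP_ALL : 0;
	vm->disabled_quirks = vm->forced_disabled;
}

/* Setting interface: returns 0 on success, -1 for unknown quirk bits. */
static int quirk_set_disabled(struct quirk_vm *vm, uint64_t mask)
{
	if (mask & ~Q_VALID)
		return -1;

	/* Kernel-forced bits stay set no matter what userspace passes. */
	vm->disabled_quirks = mask | vm->forced_disabled;
	assert((vm->disabled_quirks & vm->forced_disabled) ==
	       vm->forced_disabled);
	return 0;
}

/* Querying interface: lets userspace see the effective mask. */
static uint64_t quirk_get_disabled(const struct quirk_vm *vm)
{
	return vm->disabled_quirks;
}
```

In this model a TDX VM starts with the slot-zap quirk disabled, a later
attempt to pass a mask of 0 leaves it disabled, and userspace can read back
the effective mask to confirm that.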

Patch 1:   KVM changes.
Patch 2-5: Selftests updates. Verify memslot move/deletion functionality
           with the quirk enabled/disabled.


Yan Zhao (5):
  KVM: x86/mmu: Introduce a quirk to control memslot zap behavior
  KVM: selftests: Test slot move/delete with slot zap quirk
    enabled/disabled
  KVM: selftests: Allow slot modification stress test with quirk
    disabled
  KVM: selftests: Test memslot move in memslot_perf_test with quirk
    disabled
  KVM: selftests: Test private access to deleted memslot with quirk
    disabled

 Documentation/virt/kvm/api.rst                |  6 ++++
 arch/x86/include/asm/kvm_host.h               |  3 +-
 arch/x86/include/uapi/asm/kvm.h               |  1 +
 arch/x86/kvm/mmu/mmu.c                        | 36 ++++++++++++++++++-
 .../kvm/memslot_modification_stress_test.c    | 19 ++++++++--
 .../testing/selftests/kvm/memslot_perf_test.c | 12 ++++++-
 .../selftests/kvm/set_memory_region_test.c    | 29 ++++++++++-----
 .../kvm/x86_64/private_mem_kvm_exits_test.c   | 11 ++++--
 8 files changed, 102 insertions(+), 15 deletions(-)

base-commit: dd5a440a31fae6e459c0d6271dddd62825505361
-- 
2.43.2



