EPT-Based Sub-Page write Protection (SPP) allows the Virtual Machine
Monitor (VMM) to specify write permission for guest physical memory at
sub-page (128-byte) granularity. When SPP is active, hardware enforces
write-access checks for sub-pages within a protected 4KB page.

The feature aims to provide fine-grained memory protection for usages
such as memory guarding and VM introspection.

SPP is active when the "sub-page write protection" control (bit 23) is
set in the Secondary VM-Execution Controls. The feature is backed by a
Sub-Page Permission Table (SPPT). The sub-page permission vector is
stored in the leaf entry of the SPPT, and the root page is referenced
via a Sub-Page Permission Table Pointer (SPPTP) in the VMCS.

To enable SPP for guest memory, the guest page must first be mapped by
a 4KB EPT entry, and then the SPP bit (bit 61) of that entry must be
set. While hardware walks the EPT, it traverses the SPPT with the gpa to
look up the sub-page permission vector in the SPPT leaf entry. If the
corresponding bit is set, the write to the sub-page is permitted;
otherwise, an SPP-induced EPT violation is generated.

This patch series passed the SPP function test and selftest on an
Ice-Lake platform.

Please refer to the SPP introduction document in this patch set and the
Intel SDM for details:

Intel SDM:
https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

Patch 1: Documentation for SPP and related API.
Patch 2: Put MMU/SPP shared definitions to a new mmu_internal.h file.
Patch 3: SPPT setup functions
Patch 4: Functions to {get|set}_subpage permission
Patch 5: Introduce user-space SPP IOCTLs
Patch 6: Handle SPP induced vmexit and EPT violation
Patch 7: Enable Lazy mode SPP protection
Patch 8: Re-enable SPP if EPT mapping changes
Patch 9: Enable SPP in instruction emulation
Patch 10: Initialize SPP related data structures.
Patch 11: selftest for SPP.

Change logs:
v12:
  1. Put MMU/SPP shared definitions/prototypes into a new mmu_internal.h
     per maintainers' comments.
  2. Changed fast_page_fault()'s return type from bool to RET_PF_* per
     Paolo's suggestion.
  3. Re-allocate the SPPT root page if the root was allocated in an early
     spp_init() but was freed together with the EPT root page release.
     The issue was reported by Stefan Sicleru <ssicleru@xxxxxxxxxxxxxxx>.
  4. Other refactoring per the above changes.
  5. Rebased patches to 5.7-rc5.
  6. Fixed a virtual address mapping issue in the selftest.

v11:
  1. Refactored patches per Sean's review feedback.
  2. Added HW/KVM capabilities check before initializing SPP.
  3. Combined a few functions having similar usages.
  4. Removed unnecessary functions in kvm_x86_ops.
  5. Other code fixes according to testing.

v10:
  1. Cleared SPP active flag on VM resetting.
  2. Added tracepoints on subpage setup and SPP induced vmexits.
  3. Fixed a few code issues reported by Intel test robot.

v9:
  1. Added SPP protection check in pte prefetch case.
  2. Flushed EPT rmap to remove existing mappings of the target gfns.
  3. Modified documentation to reflect recent changes.
  4. Other minor code refactoring.

v8:
  1. Changed ioctl interface definition per Paolo's comments.
  2. Replaced SPP_INIT ioctl function with KVM_ENABLE_CAP.
  3. Removed SPP bit from X86 feature word.
  4. Returned instruction length to user-space when an SPP induced EPT
     violation happens; this provides flexibility in how SPP is used,
     i.e. to revert the write or to track the write.
  5. Modified selftest application and added it into this series.
  6. Simplified SPP permission vector check.
  7. Moved spp.c and spp.h to kvm/mmu folder.
  8. Other code fixes according to Paolo's feedback and testing.

v7:
  1. Configured all available protected pages once an SPP induced vmexit
     happens, since there's no PRESENT bit in the SPPT leaf entry.
  2. Changed SPP protection check flow in tdp_page_fault().
  3. Code refactoring and minor fixes.

v6:
  1. Added SPP protection patch for emulation cases per Jim's review.
  2. Modified documentation and added API description per Jim's review.
  3. Other minor changes suggested by Jim.

v5:
  1. Enabled SPP support for hugepages (1GB/2MB) to extend application.
  2. Made the SPP miss vm-exit handler the unified place to set up SPPT.
  3. If SPP protected pages are access-tracked or dirty-page-tracked,
     store the SPP flag in a reserved address bit and restore it in the
     fast_page_fault() handler.
  4. Moved SPP specific functions to vmx/spp.c and vmx/spp.h
  5. Rebased code to kernel v5.3
  6. Other changes suggested by the KVM community.

v4:
  1. Modified documentation to make it consistent with patches.
  2. Allocated SPPT root page in init_spp() instead of vmx_set_cr3() to
     avoid SPPT miss error.
  3. Added back co-developers and sign-offs.

v3:
  1. Rebased patches to kernel 5.1 release
  2. Deferred SPPT setup to the EPT fault handler if the page is not
     available while set_subpage() is being called.
  3. Added init IOCTL to reduce extra cost if SPP is not used.
  4. Refactored patch structure, cleaned up cross referenced functions.
  5. Added code to deal with memory swapping/migration/shrinker cases.

v1:
  1. Rebased to 4.20-rc1
  2. Moved VMCS change to a separate patch.
  3. Code refinement and bug fixes

Yang Weijiang (11):
  Documentation: Add EPT based Subpage Protection and related APIs
  mmu: spp: Add a new header file to put definitions shared by MMU and
    SPP
  mmu: spp: Implement SPPT setup functions
  mmu: spp: Implement functions to {get|set}_subpage permission
  x86: spp: Introduce user-space SPP IOCTLs
  vmx: spp: Handle SPP induced vmexit and EPT violation
  mmu: spp: Enable Lazy mode SPP protection
  mmu: spp: Re-enable SPP protection when EPT mapping changes
  x86: spp: Add SPP protection check in instruction emulation
  vmx: spp: Initialize SPP bitmap and SPP protection
  kvm: selftests: selftest for Sub-Page protection

 Documentation/virt/kvm/api.rst                |  38 ++
 Documentation/virtual/kvm/spp_kvm.txt         | 179 +++++
 arch/x86/include/asm/kvm_host.h               |  11 +-
 arch/x86/include/asm/vmx.h                    |  10 +
 arch/x86/include/asm/vmxfeatures.h            |   1 +
 arch/x86/include/uapi/asm/vmx.h               |   2 +
 arch/x86/kvm/Makefile                         |   2 +-
 arch/x86/kvm/mmu.h                            |   9 +-
 arch/x86/kvm/mmu/mmu.c                        | 287 ++++----
 arch/x86/kvm/mmu/spp.c                        | 621 ++++++++++++++++++
 arch/x86/kvm/mmu/spp.h                        |  39 ++
 arch/x86/kvm/mmu_internal.h                   | 147 +++++
 arch/x86/kvm/mmutrace.h                       |  10 +-
 arch/x86/kvm/trace.h                          |  66 ++
 arch/x86/kvm/vmx/capabilities.h               |   5 +
 arch/x86/kvm/vmx/vmx.c                        | 110 ++++
 arch/x86/kvm/x86.c                            | 135 ++++
 include/uapi/linux/kvm.h                      |  17 +
 tools/testing/selftests/kvm/Makefile          |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c    |   1 +
 tools/testing/selftests/kvm/x86_64/spp_test.c | 235 +++++++
 21 files changed, 1793 insertions(+), 133 deletions(-)
 create mode 100644 Documentation/virtual/kvm/spp_kvm.txt
 create mode 100644 arch/x86/kvm/mmu/spp.c
 create mode 100644 arch/x86/kvm/mmu/spp.h
 create mode 100644 arch/x86/kvm/mmu_internal.h
 create mode 100644 tools/testing/selftests/kvm/x86_64/spp_test.c

--
2.17.2
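
For readers new to the feature, the sub-page permission check described
in the overview can be sketched in C as follows. This is only an
illustration under stated assumptions, not code from this series: the
helper names are hypothetical, and the permission vector is modeled as a
plain 32-bit value in which bit i covers the i-th 128-byte sub-page of a
4KB page. The actual bit layout of an SPPT leaf entry is defined by the
Intel SDM and may differ from this simplified model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPP_PAGE_SIZE     4096u   /* protected page size (4KB)  */
#define SPP_SUBPAGE_SIZE  128u    /* sub-page granularity       */
#define SPP_SUBPAGES      (SPP_PAGE_SIZE / SPP_SUBPAGE_SIZE)  /* 32 */

/* Index (0..31) of the 128-byte sub-page that contains gpa. */
static unsigned int subpage_index(uint64_t gpa)
{
	return (unsigned int)((gpa & (SPP_PAGE_SIZE - 1)) / SPP_SUBPAGE_SIZE);
}

/*
 * Hypothetical model: bit i of vector set => sub-page i is writable.
 * A clear bit corresponds to the SPP-induced EPT violation case.
 */
static bool subpage_write_allowed(uint32_t vector, uint64_t gpa)
{
	return (vector >> subpage_index(gpa)) & 1u;
}

/*
 * A write of len bytes is allowed only if every sub-page it touches is
 * writable. The write must not cross the 4KB page boundary.
 */
static bool spp_write_allowed(uint32_t vector, uint64_t gpa, unsigned int len)
{
	unsigned int first, last, i;

	assert(len > 0);
	assert((gpa & (SPP_PAGE_SIZE - 1)) + len <= SPP_PAGE_SIZE);

	first = subpage_index(gpa);
	last = subpage_index(gpa + len - 1);

	for (i = first; i <= last; i++) {
		if (!((vector >> i) & 1u))
			return false;
	}
	return true;
}
```

With this model, a vector of 0x6 (sub-pages 1 and 2 writable) permits an
8-byte write at page offset 0xfc, which straddles sub-pages 1 and 2,
while a vector of 0x2 rejects the same write because sub-page 2 is
protected.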