This patchset is also available at:

  https://github.com/amdese/linux/commits/snp-host-v10

and is based on top of the following series:

  "[PATCH RFC gmem v1 0/8] KVM: gmem hooks/changes needed for x86 (other archs?)"
  https://lore.kernel.org/kvm/20231016115028.996656-1-michael.roth@xxxxxxx/

which in turn is based on the KVM-x86 staging tree for guest_memfd:

  https://github.com/kvm-x86/linux/commits/guest_memfd


== OVERVIEW ==

This patchset implements SEV-SNP hypervisor support for Linux. It relies on
the gmem changes noted above, which are still in an RFC state, but other
than those aspects, the series is being targeted for inclusion in the KVM
x86 tree to support running SEV-SNP guests on AMD EPYC systems utilizing
Zen 3 and newer microarchitectures.

More details on what SEV-SNP is and how it works are available below under
"BACKGROUND".


== PATCH LAYOUT ==

PATCH 01-02: Dependencies for patch #3 that are already upstream but not
             in current guest_memfd staging tree
PATCH 03   : General SEV-ES fix for MSR_IA32_XSS interception that fixes a
             minor bug for SEV-ES, but a more severe one for SNP guests.
             Planning to also submit this separately as an SEV-ES fix.
PATCH 04-19: Host SNP initialization code and CCP driver prep for handling
             SNP cmds
PATCH 20-43: General SNP enablement for KVM and CCP driver
PATCH 44-50: Misc handling for IOMMU support, guest request handling,
             debug infrastructure, and kdump-related handling.


== TESTING ==

For testing this via QEMU, use the following tree:

  https://github.com/amdese/qemu/commits/snp-latest-gmem-v12

SEV-SNP with gmem enabled:

  # set discard=none to disable discarding memory post-conversion, faster
  # boot times, but increased memory usage
  qemu-system-x86_64 -cpu EPYC-Milan-v2 \
    -object memory-backend-memfd-private,id=ram1,size=2G,share=true \
    -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,discard=both \
    -machine q35,confidential-guest-support=sev0,memory-backend=ram1,kvm-type=protected \
    ...

KVM selftests for UPM:

  cd $kernel_src_dir
  make -C tools/testing/selftests TARGETS="kvm" EXTRA_CFLAGS="-DDEBUG -I<path to kernel headers>"
  sudo tools/testing/selftests/kvm/x86_64/private_mem_conversions_test


== BACKGROUND (SEV-SNP) ==

This part of the Secure Nested Paging (SEV-SNP) series focuses on the
changes required in a host OS for SEV-SNP support. The series builds upon
SEV-SNP Guest Support, now part of mainline.

This series provides the basic building blocks to support booting SEV-SNP
VMs; it does not cover all of the security enhancements introduced by
SEV-SNP, such as interrupt protection.

The CCP driver is enhanced to provide new APIs that use the SEV-SNP
specific commands defined in the SEV-SNP firmware specification. The KVM
driver uses those APIs to create and manage SEV-SNP guests.

The GHCB specification version 2 introduces a new set of NAEs that are
used by the SEV-SNP guest to communicate with the hypervisor. The series
provides support to handle the following new NAE events:

- Register GHCB GPA
- Page State Change Request
- Hypervisor feature
- Guest message request

When pages are marked as guest-owned in the RMP table, they are assigned to
a specific guest/ASID, as well as a specific GFN within the guest. Any
attempt to map such a page in the RMP table to a different guest/ASID, or
to a different GFN within a guest/ASID, will result in an RMP nested page
fault.

Prior to accessing a guest-owned page, the guest must validate it with a
special PVALIDATE instruction which will set a special bit in the RMP table
for the guest. This is the only way to set the validated bit outside of the
initial pre-encrypted guest payload/image; any attempts outside the guest
to modify the RMP entry from that point forward will result in the
validated bit being cleared, at which point the guest will trigger an
exception if it attempts to access that page so it can be made aware of
possible tampering.
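
As a rough illustration only (not part of this series), the ownership and
validation rules above can be sketched as a toy model. Everything here is
hypothetical: ToyRmp, RmpEntry, host_assign(), guest_pvalidate() and
guest_access() are names made up for the sketch, and the real RMP entry
layout and fault behavior are defined by the hardware and are considerably
more involved.

```python
from dataclasses import dataclass


class RmpFault(Exception):
    """Toy stand-in for an RMP nested page fault (#NPF)."""


class NotValidated(Exception):
    """Toy stand-in for the exception a guest takes on an unvalidated page."""


@dataclass
class RmpEntry:
    assigned: bool = False   # False: default shared/hypervisor-owned state
    asid: int = 0            # owning guest, meaningful only when assigned
    gfn: int = 0             # guest frame the page is bound to
    validated: bool = False  # set only from inside the guest (PVALIDATE)


class ToyRmp:
    """Hypothetical, heavily simplified RMP: one entry per host PFN."""

    def __init__(self):
        self.entries = {}

    def host_assign(self, pfn, asid, gfn):
        # Any host-side rewrite of an entry leaves the validated bit clear.
        self.entries[pfn] = RmpEntry(True, asid, gfn, False)

    def _check(self, pfn, asid, gfn):
        # The mapping used for the access must match the entry's owner and GFN.
        entry = self.entries.get(pfn, RmpEntry())
        if not (entry.assigned and entry.asid == asid and entry.gfn == gfn):
            raise RmpFault(f"pfn {pfn:#x}: not owned by asid {asid} at gfn {gfn:#x}")
        return entry

    def guest_pvalidate(self, pfn, asid, gfn):
        # Only the owning guest, at the assigned GFN, can set the validated bit.
        self._check(pfn, asid, gfn).validated = True

    def guest_access(self, pfn, asid, gfn):
        if not self._check(pfn, asid, gfn).validated:
            raise NotValidated(f"pfn {pfn:#x}: page has not been validated")
        return True
```

In this sketch, a page that was assigned and validated becomes inaccessible
again as soon as host_assign() is called on it a second time: the guest then
sees either RmpFault (owner or GFN changed) or NotValidated (same owner and
GFN, but the validated bit was cleared), which mirrors how the guest is made
aware of possible tampering.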

One exception to this is the initial guest payload, which is pre-validated
by the firmware prior to launching. The guest can use Guest Message
requests to fetch an attestation report which will include the measurement
of the initial image so that the guest can verify it was booted with the
expected image/environment.

After boot, guests can use Page State Change requests to switch pages
between shared/hypervisor-owned and private/guest-owned to share data for
things like DMA, virtio buffers, and other GHCB requests.

In this implementation of SEV-SNP, private guest memory is managed by a new
kernel framework called guest_memfd (gmem). With gmem, a new
KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM MMU
whether a particular GFN should be backed by shared (normal) memory or
private (gmem-allocated) memory. To tie into this, Page State Change
requests are forwarded to userspace via KVM_EXIT_VMGEXIT exits, and
userspace then issues the corresponding KVM_SET_MEMORY_ATTRIBUTES call to
set the private/shared state in the KVM MMU.

The gmem / KVM MMU hooks implemented in this series will then update the
RMP table entries for the backing PFNs to set them to guest-owned/private
when mapping private pages into the guest via KVM MMU, or use the normal
KVM MMU handling in the case of shared pages where the corresponding RMP
table entries are left in the default shared/hypervisor-owned state.

Feedback/review is very much appreciated!

-Mike

Changes since v9:

 * Split off gmem changes to separate RFC series, drop RFC tag from this
   series
 * Use 2M RMPUPDATE instructions whenever possible when
   invalidating/releasing gmem pages
 * Tighten up RMP #NPF handling to better differentiate spurious cases
   from unexpected behavior
 * Simplify/optimize logic for determining when 2M NPT private mappings
   are possible
 * Be more consistent with PFN data types and stub return values (Dave)
 * Reduce potential flooding from frequently-printed pr_debug()'s (Dave)
 * Use existing #PF handling paths to catch illegal userspace-generated
   RMP faults (Dave)
 * Improve host kexec/kdump support (Ashish)
 * Reduce overhead from unnecessary WBINVD via MMU notifiers (Ashish)
 * Avoid host crashes during CCP module probe if SNP_INIT* is issued while
   guests are running (Tom L.)
 * Simplify AutoIBRS disablement (Kim, Dave)
 * Avoid unnecessary zero'ing in extended guest requests (Alexey)
 * Fix padding in struct sev_user_data_ext_snp_config (Alexey)
 * Report AP creation failures via GHCB error codes rather than inducing
   #GP in guest (Peter)
 * Disallow multiple allocations of snp_context via userspace (Peter)
 * Error out on unsupported SNP policy bits (Tom)
 * Fix snp_leak_pages() stub (Jeremi)
 * Use C99 flexible arrays where appropriate
 * Use helper to handle HVA->PFN conversions prior to dumping RMP entries
   (Dave)
 * Don't potentially print out all 512 entries when dumping 2MB RMP range
   (Dave)
 * Don't use a union to dump raw RMP entries, just cast at dump-site (Dave)
 * Don't use helpers to access RMP entry bitfields, use them directly
   (Dave)
 * Simplify logic and improve comments for AutoIBRS disablement (Dave)

 # Changes that were split off to separate gmem series
 * Use KVM_X86_SNP_VM to implement SNP-specific checks on whether a fault
   was shared/private and drop the duplicate memslot lookup (Isaku, Sean)
 * Use Isaku's version of patch to plumb 64-bit #NPF error code (Isaku)
 * Fix up stub for kvm_arch_gmem_invalidate() (Boris)

Changes since v8:

 * Rework gmem/UPM hooks based on Sean's latest gmem/UPM tree
 * Move SEV lazy-pinning support out to a separate series which uses this
   series as a prereq instead of the other way around.
 * Re-organize extended guest request patches into 3 patches encompassing
   SEV FD ioctls for host-wide certs, KVM ioctls for per-instance certs,
   and the guest request handling that consumes them. Also move them to
   the top of the series to better separate them from the core SNP patches
   (Alexey, Zhi, Ashish, Dov, Dionna, others)
 * Various other changes/fixups for extended guest request handling (Dov,
   Alexey, Dionna)
 * Use helper to calculate max RMP entry size and improve readability
   (Dave)
 * Use architecture-independent GPA value for initial VMSA pages
 * Ensure SEV_CMD_SNP_GUEST_REQUEST failures are indicated to guest (Alex)
 * Allocate per-instance certs on-demand (Alex)
 * Comment fixup for RMP fault handling (Zhi)
 * Commit msg rewording for MSR-based PSCs (Zhi)
 * Update SNP command/struct definitions based on 1.54 ABI (Saban)
 * Use sev_deactivate_lock around SEV_CMD_SNP_DECOMMISSION (Saban)
 * Various comment/commit fixups (Zhi, Alex, Kim, Vlastimil, Dave,
 * kexec fixes for newer SNP firmwares (Ashish)
 * Various other fixups and re-ordering of patches.

----------------------------------------------------------------
Ashish Kalra (4):
      x86/sev: Introduce snp leaked pages list
      KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP
      iommu/amd: Add IOMMU_SNP_SHUTDOWN support
      crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump

Brijesh Singh (29):
      x86/cpufeatures: Add SEV-SNP CPU feature
      x86/sev: Add the host SEV-SNP initialization support
      x86/sev: Add RMP entry lookup helpers
      x86/fault: Add helper for dumping RMP entries
      x86/traps: Define RMP violation #PF error code
      x86/sev: Add helper functions for RMPUPDATE and PSMASH instruction
      x86/sev: Invalidate pages from the direct map when adding them to the RMP table
      crypto: ccp: Define the SEV-SNP commands
      crypto: ccp: Add support to initialize the AMD-SP for SEV-SNP
      crypto: ccp: Provide API to issue SEV and SNP commands
      crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
      crypto: ccp: Handle the legacy SEV command when SNP is enabled
      crypto: ccp: Add the SNP_PLATFORM_STATUS command
      KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
      KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
      KVM: SEV: Add initial SEV-SNP support
      KVM: SEV: Add KVM_SNP_INIT command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
      KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command
      KVM: SEV: Add support to handle GHCB GPA register VMGEXIT
      KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT
      KVM: SEV: Add support to handle Page State Change VMGEXIT
      KVM: x86: Export the kvm_zap_gfn_range() for the SNP use
      KVM: SEV: Add support to handle RMP nested page faults
      KVM: SVM: Add module parameter to enable the SEV-SNP
      crypto: ccp: Add the SNP_{SET,GET}_EXT_CONFIG command
      KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
      crypto: ccp: Add debug support for decrypting pages

Dionna Glaze (1):
      x86/sev: Add KVM commands for per-instance certs

Kim Phillips (1):
      x86/speculation: Do not enable Automatic IBRS if SEV SNP is enabled

Michael Roth (9):
      KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
      x86/fault: Report RMP page faults for kernel addresses
      KVM: SEV: Select CONFIG_KVM_SW_PROTECTED_VM when CONFIG_KVM_AMD_SEV=y
      KVM: SEV: Add KVM_EXIT_VMGEXIT
      KVM: SEV: Add support for GHCB-based termination requests
      KVM: SEV: Implement gmem hook for initializing private pages
      KVM: SEV: Implement gmem hook for invalidating private pages
      KVM: x86: Add gmem hook for determining max NPT mapping level
      iommu/amd: Report all cases inhibiting SNP enablement

Paolo Bonzini (1):
      KVM: SVM: INTERCEPT_RDTSCP is never intercepted anyway

Tom Lendacky (4):
      KVM: SVM: Fix TSC_AUX virtualization setup
      KVM: SEV: Add support to handle AP reset MSR protocol
      KVM: SEV: Use a VMSA physical address variable for populating VMCB
      KVM: SEV: Support SEV-SNP AP Creation NAE event

Vishal Annapurve (1):
      KVM: Add HVA range operator

 Documentation/virt/coco/sev-guest.rst      |   54 +
 Documentation/virt/kvm/api.rst             |   34 +
 .../virt/kvm/x86/amd-memory-encryption.rst |  147 ++
 arch/x86/Kbuild                            |    2 +
 arch/x86/include/asm/cpufeatures.h         |    1 +
 arch/x86/include/asm/disabled-features.h   |    8 +-
 arch/x86/include/asm/kvm-x86-ops.h         |    2 +
 arch/x86/include/asm/kvm_host.h            |    5 +
 arch/x86/include/asm/msr-index.h           |   11 +-
 arch/x86/include/asm/sev-common.h          |   33 +
 arch/x86/include/asm/sev-host.h            |   37 +
 arch/x86/include/asm/sev.h                 |    6 +
 arch/x86/include/asm/svm.h                 |    6 +
 arch/x86/include/asm/trap_pf.h             |    4 +
 arch/x86/kernel/cpu/amd.c                  |   24 +-
 arch/x86/kernel/cpu/common.c               |    7 +-
 arch/x86/kernel/crash.c                    |    7 +
 arch/x86/kvm/Kconfig                       |    3 +
 arch/x86/kvm/lapic.c                       |    5 +-
 arch/x86/kvm/mmu.h                         |    2 -
 arch/x86/kvm/mmu/mmu.c                     |   13 +-
 arch/x86/kvm/svm/nested.c                  |    2 +-
 arch/x86/kvm/svm/sev.c                     | 1903 +++++++++++++++++---
 arch/x86/kvm/svm/svm.c                     |   64 +-
 arch/x86/kvm/svm/svm.h                     |   41 +-
 arch/x86/kvm/x86.c                         |   11 +
 arch/x86/mm/fault.c                        |    5 +
 arch/x86/virt/svm/Makefile                 |    3 +
 arch/x86/virt/svm/sev.c                    |  548 ++++++
 drivers/crypto/ccp/sev-dev.c               | 1253 ++++++++++++-
 drivers/crypto/ccp/sev-dev.h               |   16 +
 drivers/iommu/amd/init.c                   |   65 +-
 include/linux/amd-iommu.h                  |    5 +-
 include/linux/kvm_host.h                   |    6 +
 include/linux/psp-sev.h                    |  304 +++-
 include/uapi/linux/kvm.h                   |   74 +
 include/uapi/linux/psp-sev.h               |   71 +
 tools/arch/x86/include/asm/cpufeatures.h   |    1 +
 virt/kvm/kvm_main.c                        |   49 +
 39 files changed, 4497 insertions(+), 335 deletions(-)
 create mode 100644 arch/x86/include/asm/sev-host.h
 create mode 100644 arch/x86/virt/svm/Makefile
 create mode 100644 arch/x86/virt/svm/sev.c