On Tue, May 07, 2024 at 08:04:50PM +0200, Paolo Bonzini wrote: > On Wed, May 1, 2024 at 11:03 AM Michael Roth <michael.roth@xxxxxxx> wrote: > > > > This patchset is also available at: > > > > https://github.com/amdese/linux/commits/snp-host-v15 > > > > and is based on top of the series: > > > > "Add SEV-ES hypervisor support for GHCB protocol version 2" > > https://lore.kernel.org/kvm/20240501071048.2208265-1-michael.roth@xxxxxxx/ > > https://github.com/amdese/linux/commits/sev-init2-ghcb-v1 > > > > which in turn is based on commit 20cc50a0410f (just before v14 SNP patches): > > > > https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue > > I have mostly reviewed this, with the exception of the > snp_begin/complete_psc parts. Thanks Paolo. We actually recently uncovered some issues with snp_begin/complete_psc using some internal kvm-unit-tests that exercise some edge cases, so I would hold off on reviewing that. Will send a fix-up patch today after a bit more testing. > > I am not sure about removing all the pr_debug() - I am sure it will be > a bit more painful for userspace developers to figure out what exactly > has gone wrong in some cases. But we can add them later, if needed - > I'm certainly not going to make a fuss about it. Yah, they do tend to be useful for that purpose. I think if we do add them back we can consolidate the information a little better versus what I had previously. -Mike > > Paolo > > > > Patch Layout > > ------------ > > > > 01-02: These patches revert+replace the existing .gmem_validate_fault hook > > with a similar .private_max_mapping_level as suggested by Sean[1] > > > > 03-04: These patches add some basic infrastructure and introduces a new > > KVM_X86_SNP_VM vm_type to handle differences verses the existing > > KVM_X86_SEV_VM and KVM_X86_SEV_ES_VM types. > > > > 05-07: These implement the KVM API to handle the creation of a > > cryptographic launch context, encrypt/measure the initial image > > into guest memory, and finalize it before launching it. > > > > 08-12: These implement handling for various guest-generated events such > > as page state changes, onlining of additional vCPUs, etc. > > > > 13-16: These implement the gmem/mmu hooks needed to prepare gmem-allocated > > pages before mapping them into guest private memory ranges as > > well as cleaning them up prior to returning them to the host for > > use as normal memory. Because this supplants certain activities > > like issued WBINVDs during KVM MMU invalidations, there's also > > a patch to avoid duplicating that work to avoid unecessary > > overhead. > > > > 17: With all the core support in place, the patch adds a kvm_amd module > > parameter to enable SNP support. > > > > 18-20: These patches all deal with the servicing of guest requests to handle > > things like attestation, as well as some related host-management > > interfaces. > > > > [1] https://lore.kernel.org/kvm/ZimnngU7hn7sKoSc@xxxxxxxxxx/#t > > > > > > Testing > > ------- > > > > For testing this via QEMU, use the following tree: > > > > https://github.com/amdese/qemu/commits/snp-v4-wip3c > > > > A patched OVMF is also needed due to upstream KVM no longer supporting MMIO > > ranges that are mapped as private. It is recommended you build the AmdSevX64 > > variant as it provides the kernel-hashing support present in this series: > > > > https://github.com/amdese/ovmf/commits/apic-mmio-fix1d > > > > A basic command-line invocation for SNP would be: > > > > qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 > > -machine q35,confidential-guest-support=sev0,memory-backend=ram1 > > -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false > > -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth= > > -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd > > > > With kernel-hashing and certificate data supplied: > > > > qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 > > -machine q35,confidential-guest-support=sev0,memory-backend=ram1 > > -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false > > -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth=,certs-path=/home/mroth/cert.blob,kernel-hashes=on > > -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d-AmdSevX64.fd > > -kernel /boot/vmlinuz-$ver > > -initrd /boot/initrd.img-$ver > > -append "root=UUID=d72a6d1c-06cf-4b79-af43-f1bac4f620f9 ro console=ttyS0,115200n8" > > > > With standard X64 OVMF package with separate image for persistent NVRAM: > > > > qemu-system-x86_64 -smp 32,maxcpus=255 -cpu EPYC-Milan-v2 > > -machine q35,confidential-guest-support=sev0,memory-backend=ram1 > > -object memory-backend-memfd,id=ram1,size=4G,share=true,reserve=false > > -object sev-snp-guest,id=sev0,cbitpos=51,reduced-phys-bits=1,id-auth= > > -bios OVMF_CODE-upstream-20240410-apic-mmio-fix1d.fd > > -drive if=pflash,format=raw,unit=0,file=OVMF_VARS-upstream-20240410-apic-mmio-fix1d.fd,readonly=off > > > > > > Known issues / TODOs > > -------------------- > > > > * Base tree in some cases reports "Unpatched return thunk in use. This should > > not happen!" the first time it runs an SVM/SEV/SNP guests. This a recent > > regression upstream and unrelated to this series: > > > > https://lore.kernel.org/linux-kernel/CANpmjNOcKzEvLHoGGeL-boWDHJobwfwyVxUqMq2kWeka3N4tXA@xxxxxxxxxxxxxx/T/ > > > > * 2MB hugepage support has been dropped pending discussion on how we plan to > > re-enable it in gmem. > > > > * Host kexec should work, but there is a known issue with host kdump support > > while SNP guests are running that will be addressed as a follow-up. > > > > * SNP kselftests are currently a WIP and will be included as part of SNP > > upstreaming efforts in the near-term. > > > > > > SEV-SNP Overview > > ---------------- > > > > This part of the Secure Encrypted Paging (SEV-SNP) series focuses on the > > changes required to add KVM support for SEV-SNP. This series builds upon > > SEV-SNP guest support, which is now in mainline, and and SEV-SNP host > > initialization support, which is now in linux-next. > > > > While series provides the basic building blocks to support booting the > > SEV-SNP VMs, it does not cover all the security enhancement introduced by > > the SEV-SNP such as interrupt protection, which will added in the future. > > > > With SNP, when pages are marked as guest-owned in the RMP table, they are > > assigned to a specific guest/ASID, as well as a specific GFN with in the > > guest. Any attempts to map it in the RMP table to a different guest/ASID, > > or a different GFN within a guest/ASID, will result in an RMP nested page > > fault. > > > > Prior to accessing a guest-owned page, the guest must validate it with a > > special PVALIDATE instruction which will set a special bit in the RMP table > > for the guest. This is the only way to set the validated bit outside of the > > initial pre-encrypted guest payload/image; any attempts outside the guest to > > modify the RMP entry from that point forward will result in the validated > > bit being cleared, at which point the guest will trigger an exception if it > > attempts to access that page so it can be made aware of possible tampering. > > > > One exception to this is the initial guest payload, which is pre-validated > > by the firmware prior to launching. The guest can use Guest Message requests > > to fetch an attestation report which will include the measurement of the > > initial image so that the guest can verify it was booted with the expected > > image/environment. > > > > After boot, guests can use Page State Change requests to switch pages > > between shared/hypervisor-owned and private/guest-owned to share data for > > things like DMA, virtio buffers, and other GHCB requests. > > > > In this implementation of SEV-SNP, private guest memory is managed by a new > > kernel framework called guest_memfd (gmem). With gmem, a new > > KVM_SET_MEMORY_ATTRIBUTES KVM ioctl has been added to tell the KVM > > MMU whether a particular GFN should be backed by shared (normal) memory or > > private (gmem-allocated) memory. To tie into this, Page State Change > > requests are forward to userspace via KVM_EXIT_VMGEXIT exits, which will > > then issue the corresponding KVM_SET_MEMORY_ATTRIBUTES call to set the > > private/shared state in the KVM MMU. > > > > The gmem / KVM MMU hooks implemented in this series will then update the RMP > > table entries for the backing PFNs to set them to guest-owned/private when > > mapping private pages into the guest via KVM MMU, or use the normal KVM MMU > > handling in the case of shared pages where the corresponding RMP table > > entries are left in the default shared/hypervisor-owned state. > > > > Feedback/review is very much appreciated! > > > > -Mike > > > > > > Changes since v14: > > > > * switch to vendor-agnostic KVM_HC_MAP_GPA_RANGE exit for forwarding > > page-state change requests to userspace instead of an SNP-specific exit > > (Sean) > > * drop SNP_PAUSE_ATTESTATION/SNP_RESUME_ATTESTATION interfaces, instead > > add handling in KVM_EXIT_VMGEXIT so that VMMs can implement their own > > mechanisms for keeping userspace-supplied certificates in-sync with > > firmware's TCB/endorsement key (Sean) > > * carve out SEV-ES-specific handling for GHCB protocol 2, add control of > > the protocol version, and post as a separate prereq patchset (Sean) > > * use more consistent error-handling in snp_launch_{start,update,finish}, > > simplify logic based on review comments (Sean) > > * rename .gmem_validate_fault to .private_max_mapping_level and rework > > logic based on review suggestions (Sean) > > * reduce number of pr_debug()'s in series, avoid multiple WARN's in > > succession (Sean) > > * improve documentation and comments throughout > > > > Changes since v13: > > > > * rebase to new kvm-coco-queue and wire up to PFERR_PRIVATE_ACCESS (Paolo) > > * handle setting kvm->arch.has_private_mem in same location as > > kvm->arch.has_protected_state (Paolo) > > * add flags and additional padding fields to > > snp_launch{start,update,finish} APIs to address alignment and > > expandability (Paolo) > > * update snp_launch_update() to update input struct values to reflect > > current progress of command in situations where mulitple calls are > > needed (Paolo) > > * update snp_launch_update() to avoid copying/accessing 'src' parameter > > when dealing with zero pages. (Paolo) > > * update snp_launch_update() to use u64 as length input parameter instead > > of u32 and adjust padding accordingly > > * modify ordering of SNP_POLICY_MASK_* definitions to be consistent with > > bit order of corresponding flags > > * let firmware handle enforcement of policy bits corresponding to > > user-specified minimum API version > > * add missing "0x" prefixs in pr_debug()'s for snp_launch_start() > > * fix handling of VMSAs during in-place migration (Paolo) > > > > Changes since v12: > > > > * rebased to latest kvm-coco-queue branch (commit 4d2deb62185f) > > * add more input validation for SNP_LAUNCH_START, especially for handling > > things like MBO/MBZ policy bits, and API major/minor minimums. (Paolo) > > * block SNP KVM instances from being able to run legacy SEV commands (Paolo) > > * don't attempt to measure VMSA for vcpu 0/BSP before the others, let > > userspace deal with the ordering just like with SEV-ES (Paolo) > > * fix up docs for SNP_LAUNCH_FINISH (Paolo) > > * introduce svm->sev_es.snp_has_guest_vmsa flag to better distinguish > > handling for guest-mapped vs non-guest-mapped VMSAs, rename > > 'snp_ap_create' flag to 'snp_ap_waiting_for_reset' (Paolo) > > * drop "KVM: SEV: Use a VMSA physical address variable for populating VMCB" > > as it is no longer needed due to above VMSA rework > > * replace pr_debug_ratelimited() messages for RMP #NPFs with a single trace > > event > > * handle transient PSMASH_FAIL_INUSE return codes in kvm_gmem_invalidate(), > > switch to WARN_ON*()'s to indicate remaining error cases are not expected > > and should not be seen in practice. (Paolo) > > * add a cond_resched() in kvm_gmem_invalidate() to avoid soft lock-ups when > > cleaning up large guest memory ranges. > > * rename VLEK_REQUIRED to VCEK_DISABLE. it's be more applicable if another > > key type ever gets added. > > * don't allow attestation to be paused while an attestation request is > > being processed by firmware (Tom) > > * add missing Documentation entry for SNP_VLEK_LOAD > > * collect Reviewed-by's from Paolo and Tom > > > > > > ---------------------------------------------------------------- > > Ashish Kalra (1): > > KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP > > > > Brijesh Singh (8): > > KVM: SEV: Add initial SEV-SNP support > > KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command > > KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command > > KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command > > KVM: SEV: Add support to handle GHCB GPA register VMGEXIT > > KVM: SEV: Add support to handle RMP nested page faults > > KVM: SVM: Add module parameter to enable SEV-SNP > > KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event > > > > Michael Roth (10): > > Revert "KVM: x86: Add gmem hook for determining max NPT mapping level" > > KVM: x86: Add hook for determining max NPT mapping level > > KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y > > KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT > > KVM: SEV: Add support to handle Page State Change VMGEXIT > > KVM: SEV: Implement gmem hook for initializing private pages > > KVM: SEV: Implement gmem hook for invalidating private pages > > KVM: x86: Implement hook for determining max NPT mapping level > > KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event > > crypto: ccp: Add the SNP_VLEK_LOAD command > > > > Tom Lendacky (1): > > KVM: SEV: Support SEV-SNP AP Creation NAE event > > > > Documentation/virt/coco/sev-guest.rst | 19 + > > Documentation/virt/kvm/api.rst | 87 ++ > > .../virt/kvm/x86/amd-memory-encryption.rst | 110 +- > > arch/x86/include/asm/kvm-x86-ops.h | 2 +- > > arch/x86/include/asm/kvm_host.h | 5 +- > > arch/x86/include/asm/sev-common.h | 25 + > > arch/x86/include/asm/sev.h | 3 + > > arch/x86/include/asm/svm.h | 9 +- > > arch/x86/include/uapi/asm/kvm.h | 48 + > > arch/x86/kvm/Kconfig | 3 + > > arch/x86/kvm/mmu.h | 2 - > > arch/x86/kvm/mmu/mmu.c | 27 +- > > arch/x86/kvm/svm/sev.c | 1538 +++++++++++++++++++- > > arch/x86/kvm/svm/svm.c | 44 +- > > arch/x86/kvm/svm/svm.h | 52 + > > arch/x86/kvm/trace.h | 31 + > > arch/x86/kvm/x86.c | 17 + > > drivers/crypto/ccp/sev-dev.c | 36 + > > include/linux/psp-sev.h | 4 +- > > include/uapi/linux/kvm.h | 23 + > > include/uapi/linux/psp-sev.h | 27 + > > include/uapi/linux/sev-guest.h | 9 + > > virt/kvm/guest_memfd.c | 4 +- > > 23 files changed, 2081 insertions(+), 44 deletions(-) > > > >