Confidential VMs have a number of additional requirements on the host
side which might involve interactions with userspace. One such case is
with SEV-SNP guests, where the host can optionally provide a certificate
table to the guest when it issues an attestation request to firmware
(see GHCB 2.0 specification regarding "SNP Extended Guest Requests").
This certificate table can then be used to verify the endorsement key
used by firmware to sign the attestation report.

While it is possible for guests to obtain the certificates through other
means, handling this via the host provides more flexibility in keeping
the certificate data in sync with the endorsement key throughout
host-side operations that might result in the endorsement key changing.

In the case of KVM, userspace will be responsible for fetching the
certificate table and keeping it in sync with any modifications to the
endorsement key. Define a new KVM_EXIT_* event where userspace is
provided with the GPA of the buffer the guest has provided as part of
the attestation request so that userspace can write the certificate data
into it.

Since there is potential for additional CoCo-related events in the
future, introduce this in the form of a more general KVM_EXIT_COCO exit
type that handles multiple sub-types, similarly to KVM_EXIT_HYPERCALL,
and then define a KVM_EXIT_COCO_REQ_CERTS sub-type to handle the actual
certificate-fetching mentioned above.

Also introduce a KVM_CAP_EXIT_COCO capability to enable/disable
individual sub-types, similarly to KVM_CAP_EXIT_HYPERCALL.

Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
---
 Documentation/virt/kvm/api.rst  | 119 ++++++++++++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kvm/x86.c              |  13 ++++
 include/uapi/linux/kvm.h        |  19 +++++
 4 files changed, 152 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 454c2aaa155e..664fba2739a9 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7173,6 +7173,107 @@ Please note that the kernel is allowed to use the kvm_run structure as the
 primary storage for certain register types. Therefore, the kernel may use the
 values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
 
+::
+
+		/* KVM_EXIT_COCO */
+		struct kvm_exit_coco {
+		#define KVM_EXIT_COCO_REQ_CERTS		0
+		#define KVM_EXIT_COCO_MAX		1
+			__u8 nr;
+			__u8 pad0[7];
+			__u32 ret;
+			__u32 pad1;
+			union {
+				struct {
+					__u64 gfn;
+					__u32 npages;
+				} req_certs;
+			};
+		};
+
+KVM_EXIT_COCO events are intended to handle cases where a confidential
+VM requires some action on the part of userspace, or cases where userspace
+needs to be informed of some activity relating to a confidential VM.
+
+A `kvm_exit_coco` structure is defined to encapsulate the data to be sent to
+or returned by userspace. The `nr` field defines the specific type of event
+that needs to be serviced, and that type is used as a discriminator to
+determine which union type should be used for input/output. If the exit is
+successfully processed by userspace, `ret` should be set to 0 to indicate
+success. A non-zero `ret` value will be treated as an error unless there is
+specific handling associated with a particular error code in the per-union
+type documentation.
+
+The parameters for each of these event/union types are documented below:
+
+ - ``KVM_EXIT_COCO_REQ_CERTS``
+
+   This event provides a way to request certificate data from userspace and
+   have it written into guest memory. This is intended primarily to handle
+   attestation requests made by SEV-SNP guests (using the Extended Guest
+   Requests GHCB command as defined by the GHCB 2.0 specification for SEV-SNP
+   guests), where additional certificate data corresponding to the
+   endorsement key used by firmware to sign an attestation report can be
+   optionally provided by userspace to pass along to the guest together with
+   the firmware-provided attestation report.
+
+   In the case of ``KVM_EXIT_COCO_REQ_CERTS`` events, the `req_certs` union
+   type is used. KVM will supply in `gfn` the non-private guest page that
+   userspace should use to write the contents of certificate data. In the
+   case of SEV-SNP, the format of this certificate data is defined in the
+   GHCB 2.0 specification (see section "SNP Extended Guest Request"). KVM
+   will also supply in `npages` the number of contiguous pages available
+   for writing the certificate data into.
+
+   - If the supplied number of pages is sufficient, userspace must write
+     the certificate table blob (in the format defined by the GHCB spec)
+     into the address corresponding to `gfn` and set `ret` to 0 to indicate
+     success. If no certificate data is available, then userspace can
+     either write an empty certificate table into the address corresponding
+     to `gfn`, or it can disable ``KVM_EXIT_COCO_REQ_CERTS`` (via
+     ``KVM_CAP_EXIT_COCO``), in which case KVM will handle returning an
+     empty certificate table to the guest.
+
+   - If the number of pages supplied is not sufficient, userspace must set
+     the required number of pages in `npages` and then set `ret` to
+     ``ENOSPC``.
+
+   - If the certificate cannot be immediately provided, userspace should set
+     `ret` to ``EAGAIN``, which will inform the guest to retry the request
+     later. One scenario where this would be useful is if the certificate
+     is in the process of being updated and cannot be fetched until the
+     update completes (see the NOTE below regarding how file-locking can
+     be used to orchestrate such updates between management/guests).
+
+   - If some other error occurred, userspace must set `ret` to ``EIO``.
+
+   NOTE: In the case of SEV-SNP, the endorsement key used by firmware may
+   change as a result of management activities like updating SEV-SNP firmware
+   or loading new endorsement keys, so some care should be taken to keep the
+   returned certificate data in sync with the actual endorsement key in use by
+   firmware at the time the attestation request is sent to SNP firmware. The
+   recommended scheme to do this is to use file locking (e.g. via fcntl()'s
+   F_OFD_SETLK) in the following manner:
+
+   - The VMM should obtain a shared/read or exclusive/write lock on the path
+     the certificate blob file resides at before reading it and returning it
+     to KVM, and continue to hold the lock until the attestation request is
+     actually sent to firmware. To facilitate this, the VMM can set the
+     ``immediate_exit`` flag of kvm_run just after supplying the certificate
+     data, and just before resuming the vCPU. This will ensure the vCPU
+     will exit again to userspace with ``-EINTR`` after it finishes fetching
+     the attestation request from firmware, at which point the VMM can
+     safely drop the file lock.
+
+   - Tools/libraries that perform updates to SNP firmware TCB values or
+     endorsement keys (e.g. via /dev/sev interfaces such as ``SNP_COMMIT``,
+     ``SNP_SET_CONFIG``, or ``SNP_VLEK_LOAD``, see
+     Documentation/virt/coco/sev-guest.rst for more details) in such a way
+     that the certificate blob needs to be updated, should similarly take an
+     exclusive lock on the certificate blob for the duration of any updates
+     to endorsement keys or the certificate blob contents to ensure that
+     VMMs using the above scheme will not return certificate blob data that
+     is out of sync with the endorsement key used by firmware.
 
 .. _cap_enable:
 
@@ -9017,6 +9118,24 @@ Do not use KVM_X86_SW_PROTECTED_VM for "real" VMs, and especially not in
 production. The behavior and effective ABI for software-protected VMs is
 unstable.
 
+8.42 KVM_CAP_EXIT_COCO
+----------------------
+
+:Capability: KVM_CAP_EXIT_COCO
+:Architectures: x86
+:Type: vm
+
+This capability, if enabled, will cause KVM to exit to userspace with
+KVM_EXIT_COCO exit reason to process certain events related to confidential
+guests.
+
+Calling KVM_CHECK_EXTENSION for this capability will return a bitmask of
+KVM_EXIT_COCO event types that can be configured to exit to userspace.
+
+The argument to KVM_ENABLE_CAP is also a bitmask, and must be a subset
+of the result of KVM_CHECK_EXTENSION. KVM will forward to userspace
+the event types whose corresponding bit is in the argument.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e159e44a6a1b..1b4fb019023e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1438,6 +1438,7 @@ struct kvm_arch {
 	struct kvm_x86_msr_filter __rcu *msr_filter;
 
 	u32 hypercall_exit_enabled;
+	u64 coco_exit_enabled;
 
 	/* Guest can access the SGX PROVISIONKEY. */
 	bool sgx_provisioning_allowed;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2e713480933a..c9bcc39725e0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -128,6 +128,8 @@ static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
 #define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
                                     KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
 
+#define KVM_EXIT_COCO_VALID_MASK BIT_ULL(KVM_EXIT_COCO_REQ_CERTS)
+
 static void update_cr8_intercept(struct kvm_vcpu *vcpu);
 static void process_nmi(struct kvm_vcpu *vcpu);
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
@@ -4782,6 +4784,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_READONLY_MEM:
 		r = kvm ? kvm_arch_has_readonly_mem(kvm) : 1;
 		break;
+	case KVM_CAP_EXIT_COCO:
+		r = KVM_EXIT_COCO_VALID_MASK;
+		break;
 	default:
 		break;
 	}
@@ -6743,6 +6748,14 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		mutex_unlock(&kvm->lock);
 		break;
 	}
+	case KVM_CAP_EXIT_COCO:
+		if (cap->args[0] & ~KVM_EXIT_COCO_VALID_MASK) {
+			r = -EINVAL;
+			break;
+		}
+		kvm->arch.coco_exit_enabled = cap->args[0];
+		r = 0;
+		break;
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 502ea63b5d2e..f64abda153cf 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -135,6 +135,21 @@ struct kvm_xen_exit {
 	} u;
 };
 
+struct kvm_exit_coco {
+#define KVM_EXIT_COCO_REQ_CERTS		0
+#define KVM_EXIT_COCO_MAX		1
+	__u8 nr;
+	__u8 pad0[7];
+	__u32 ret;
+	__u32 pad1;
+	union {
+		struct {
+			__u64 gfn;
+			__u32 npages;
+		} req_certs;
+	};
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -178,6 +193,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_NOTIFY           37
 #define KVM_EXIT_LOONGARCH_IOCSR  38
 #define KVM_EXIT_MEMORY_FAULT     39
+#define KVM_EXIT_COCO             40
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -446,6 +462,8 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory_fault;
+		/* KVM_EXIT_COCO */
+		struct kvm_exit_coco coco;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -933,6 +951,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_EXIT_COCO 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
-- 
2.25.1
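
For illustration only, and not part of this patch: a minimal sketch of how a VMM might service a KVM_EXIT_COCO_REQ_CERTS exit following the `ret` protocol and the file-locking scheme documented in the api.rst hunk. Several things here are assumptions made for the example: the helper name `handle_req_certs`, the `COCO_PAGE_SIZE` constant (4 KiB pages), the local mirror of `struct kvm_exit_coco` using <stdint.h> types so the snippet builds without the patched uapi header, and the `dst` pointer, which stands in for whatever host mapping the VMM has for the guest page at `gfn`. Treating an empty blob file as an error is also a simplification; writing a valid empty certificate table is not shown.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* exposes F_OFD_SETLK in <fcntl.h> on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37		/* Linux value; fallback if the headers hide it */
#endif

#define KVM_EXIT_COCO_REQ_CERTS	0
#define COCO_PAGE_SIZE		4096u	/* assumption: 4 KiB guest pages */

/* Local mirror of the struct added by this patch (assumption: no uapi header). */
struct kvm_exit_coco {
	uint8_t nr;
	uint8_t pad0[7];
	uint32_t ret;
	uint32_t pad1;
	union {
		struct {
			uint64_t gfn;
			uint32_t npages;
		} req_certs;
	};
};

/*
 * Hypothetical VMM-side handler. 'dst' is the host mapping of the guest
 * buffer at req_certs.gfn; 'path' is the certificate blob file.
 *
 * On success, sets ret to 0 and returns an fd that still holds a shared
 * OFD lock on the blob. On failure, sets ret to ENOSPC, EAGAIN or EIO
 * and returns -1 with no lock held.
 */
int handle_req_certs(struct kvm_exit_coco *coco, void *dst, const char *path)
{
	/* l_start = l_len = 0 locks the whole file; l_pid must be 0 for OFD locks. */
	struct flock fl = { .l_type = F_RDLCK, .l_whence = SEEK_SET };
	struct stat st;
	uint64_t need;
	int fd;

	assert(coco && dst && path);
	assert(coco->nr == KVM_EXIT_COCO_REQ_CERTS);

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		coco->ret = EIO;
		return -1;
	}

	/* Non-blocking: a conflicting exclusive lock means an update is in flight. */
	if (fcntl(fd, F_OFD_SETLK, &fl) < 0) {
		coco->ret = (errno == EAGAIN || errno == EACCES) ? EAGAIN : EIO;
		goto fail;
	}

	/* Simplification: an empty or unreadable blob is reported as EIO. */
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		coco->ret = EIO;
		goto fail;
	}

	/* Report the required page count if the guest buffer is too small. */
	need = ((uint64_t)st.st_size + COCO_PAGE_SIZE - 1) / COCO_PAGE_SIZE;
	if (need > coco->req_certs.npages) {
		coco->req_certs.npages = (uint32_t)need;
		coco->ret = ENOSPC;
		goto fail;
	}

	if (pread(fd, dst, (size_t)st.st_size, 0) != (ssize_t)st.st_size) {
		coco->ret = EIO;
		goto fail;
	}

	coco->ret = 0;
	return fd;	/* caller keeps this open to hold the lock */

fail:
	close(fd);
	return -1;
}
```

Under the documented scheme, a caller would keep the returned fd open, set ``immediate_exit`` in kvm_run before resuming the vCPU, and close the fd (dropping the lock) once KVM_RUN returns ``-EINTR``. A non-blocking lock attempt is used here so that a concurrent certificate update maps naturally onto ``EAGAIN``; a VMM could instead block with F_OFD_SETLKW if stalling the vCPU thread is acceptable.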