On Wed, Feb 24, 2021 at 08:59:15AM +0000, Nathan Tempelman wrote:
> Add a capability for userspace to mirror SEV encryption context from
> one vm to another. On our side, this is intended to support a
> Migration Helper vCPU, but it can also be used generically to support
> other in-guest workloads scheduled by the host. The intention is for
> the primary guest and the mirror to have nearly identical memslots.
>
> The primary benefits of this are that:
> 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they
> can't accidentally clobber each other.
> 2) The VMs can have different memory-views, which is necessary for post-copy
> migration (the migration vCPUs on the target need to read and write to
> pages, when the primary guest would VMEXIT).
>
> This does not change the threat model for AMD SEV. Any memory involved
> is still owned by the primary guest and its initial state is still
> attested to through the normal SEV_LAUNCH_* flows. If userspace wanted
> to circumvent SEV, they could achieve the same effect by simply attaching
> a vCPU to the primary VM.
> This patch deliberately leaves userspace in charge of the memslots for the
> mirror, as it already has the power to mess with them in the primary guest.
>
> This patch does not support SEV-ES (much less SNP), as it does not
> handle handing off attested VMSAs to the mirror.
>
> For additional context, we need a Migration Helper because SEV PSP migration
> is far too slow for our live migration on its own. Using an in-guest
> migrator lets us speed this up significantly.
>
> Signed-off-by: Nathan Tempelman <natet@xxxxxxxxxx>
> ---
>  Documentation/virt/kvm/api.rst  | 17 +++++++
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/svm/sev.c          | 82 +++++++++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c          |  2 +
>  arch/x86/kvm/svm/svm.h          |  2 +
>  arch/x86/kvm/x86.c              |  7 ++-
>  include/linux/kvm_host.h        |  1 +
>  include/uapi/linux/kvm.h        |  1 +
>  virt/kvm/kvm_main.c             |  8 ++++
>  9 files changed, 120 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 482508ec7cc4..438b647663c9 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6213,6 +6213,23 @@ the bus lock vm exit can be preempted by a higher priority VM exit, the exit
>  notifications to userspace can be KVM_EXIT_BUS_LOCK or other reasons.
>  KVM_RUN_BUS_LOCK flag is used to distinguish between them.
>
> +7.23 KVM_CAP_VM_COPY_ENC_CONTEXT_TO
> +-----------------------------------
> +
> +Architectures: x86 SEV enabled
> +Type: system
> +Parameters: args[0] is the fd of the kvm to mirror encryption context to
> +Returns: 0 on success; ENOTTY on error
> +
> +This capability enables userspace to copy encryption context from a primary
> +vm to the vm indicated by the fd.
> +
> +This is intended to support in-guest workloads scheduled by the host. This
> +allows the in-guest workload to maintain its own NPTs and keeps the two vms
> +from accidentally clobbering each other with interrupts and the like (separate
> +APIC/MSRs/etc).
> +
> +
>  8. Other capabilities.
>  ======================
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 84499aad01a4..b7636c009647 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1334,6 +1334,7 @@ struct kvm_x86_ops {
>  	int (*mem_enc_op)(struct kvm *kvm, void __user *argp);
>  	int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
>  	int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> +	int (*vm_copy_enc_context_to)(struct kvm *kvm, unsigned int child_fd);
>
>  	int (*get_msr_feature)(struct kvm_msr_entry *entry);
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 874ea309279f..2bad6cd2cb4c 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -66,6 +66,11 @@ static int sev_flush_asids(void)
>  	return ret;
>  }
>
> +static inline bool is_mirroring_enc_context(struct kvm *kvm)
> +{
> +	return &to_kvm_svm(kvm)->sev_info.enc_context_owner;
> +}
> +
>  /* Must be called with the sev_bitmap_lock held */
>  static bool __sev_recycle_asids(int min_asid, int max_asid)
>  {
> @@ -1124,6 +1129,10 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
>  	if (copy_from_user(&sev_cmd, argp, sizeof(struct kvm_sev_cmd)))
>  		return -EFAULT;
>
> +	/* enc_context_owner handles all memory enc operations */
> +	if (is_mirroring_enc_context(kvm))
> +		return -ENOTTY;
> +
>  	mutex_lock(&kvm->lock);
>
>  	switch (sev_cmd.id) {
> @@ -1186,6 +1195,10 @@ int svm_register_enc_region(struct kvm *kvm,
>  	if (!sev_guest(kvm))
>  		return -ENOTTY;
>
> +	/* If kvm is mirroring encryption context it isn't responsible for it */
> +	if (is_mirroring_enc_context(kvm))
> +		return -ENOTTY;
> +
>  	if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
>  		return -EINVAL;
>
> @@ -1252,6 +1265,10 @@ int svm_unregister_enc_region(struct kvm *kvm,
>  	struct enc_region *region;
>  	int ret;
>
> +	/* If kvm is mirroring encryption context it isn't responsible for it */
> +	if (is_mirroring_enc_context(kvm))
> +		return -ENOTTY;
> +
>  	mutex_lock(&kvm->lock);
>
>  	if (!sev_guest(kvm)) {
> @@ -1282,6 +1299,65 @@ int svm_unregister_enc_region(struct kvm *kvm,
>  	return ret;
>  }
>
> +int svm_vm_copy_asid_to(struct kvm *kvm, unsigned int mirror_kvm_fd)
> +{
> +	struct file *mirror_kvm_file;
> +	struct kvm *mirror_kvm;
> +	struct kvm_sev_info *mirror_kvm_sev;
> +	unsigned int asid;
> +	int ret;
> +
> +	if (!sev_guest(kvm))
> +		return -ENOTTY;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	/* Mirrors of mirrors should work, but let's not get silly */
> +	if (is_mirroring_enc_context(kvm)) {
> +		ret = -ENOTTY;
> +		goto failed;
> +	}

How will A->B->C->... type of live migration work if mirrors of mirrors
are not supported?

Thanks,
Ashish