On 24/02/21 09:59, Nathan Tempelman wrote:
> Add a capability for userspace to mirror SEV encryption context from
> one vm to another. On our side, this is intended to support a
> Migration Helper vCPU, but it can also be used generically to support
> other in-guest workloads scheduled by the host. The intention is for
> the primary guest and the mirror to have nearly identical memslots.
>
> The primary benefits of this are that:
> 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they
> can't accidentally clobber each other.
> 2) The VMs can have different memory-views, which is necessary for post-copy
> migration (the migration vCPUs on the target need to read and write to
> pages, when the primary guest would VMEXIT).
>
> This does not change the threat model for AMD SEV. Any memory involved
> is still owned by the primary guest and its initial state is still
> attested to through the normal SEV_LAUNCH_* flows. If userspace wanted
> to circumvent SEV, they could achieve the same effect by simply attaching
> a vCPU to the primary VM.
> This patch deliberately leaves userspace in charge of the memslots for the
> mirror, as it already has the power to mess with them in the primary guest.
>
> This patch does not support SEV-ES (much less SNP), as it does not
> handle handing off attested VMSAs to the mirror.
>
> For additional context, we need a Migration Helper because SEV PSP migration
> is far too slow for our live migration on its own. Using an in-guest
> migrator lets us speed this up significantly.

Hello,
We've been thinking a lot about migrating confidential virtual machines
at IBM. Maybe you've seen the approach that we (Dov Murik and I)
shared on the QEMU and OVMF mailing lists. In general, we have tried to
implement migration without kernel support, which has some drawbacks.
Mainly, it is difficult to dynamically start the migration handler
without kernel support, which puts stress on OVMF. If there is momentum
behind these KVM patches, we think they could go hand-in-hand with some
of the work that we have done.
I'm not sure if you have patches for a migration handler/helper or
hypervisor support. If you do, I'd be curious to see them. If not, maybe
we should try to converge some of the work that has already happened. I
think that no matter where the migration handler ends up running or how
it is started, it will do more or less the same things: export pages to
the HV and import pages from the HV. Similarly, the hypervisor is
probably going to need similar mechanisms to ask the MH for encrypted
pages. Given that we already have some of these things, maybe there is
a way to bring them together with this patch.
I also have a few specific questions about this patch.
I am not sure how the mirror VM will be supported in QEMU. Usually there
is one QEMU process per VM. Now we would need to run a second VM and
communicate with it during migration. Is there a way to do this without
adding significant complexity?
You say that SEV-ES is not supported. While there are challenges
regarding setting the CPU state of the mirror, I think there may also be
larger issues with using the mirror for SEV-ES. With plain SEV, the
migration handler only has to worry about guest memory. With SEV-ES the
MH will probably need to set the CPU state of the guest as well. It
seems difficult to do this with an MH that is in a separate VM entirely.
Is there an expectation that the mirror-based approach will ever work
with SEV-ES?
I am curious where you plan on putting the migration handler itself. We
were drawn to OVMF because it is measured by the PSP. Do you have some
alternate approach?
Do you plan to support consecutive migrations (where the target of the
first migration is the source of the second)? This is really just a question about the lifetime
of the MH. Will the mirror VM be started and stopped dynamically or will
it persist for the life of the guest on both source and target?
Finally, do you plan to use AMD PSP-based migration to migrate parts of
the mirror VM or of the primary VM? The migration handler we've
developed does not use PSP-based migration at all; instead it relies on
secret injection to both source and target VMs to keep the migration
keys secure. There are trade-offs either way.
-Tobin