On 8/17/21 12:32 PM, Paolo Bonzini wrote:
> On 17/08/21 01:53, Steve Rutherford wrote:
>> Separately, I'm a little wary of leaving the migration helper mapped
>> into the shared address space as writable.
> A related question here is what the API should be for how the
> migration helper sees the memory in both physical and virtual address
> spaces.
>
> First of all, I would like the addresses passed to and from the
> migration helper to *not* be guest physical addresses (this is what I
> referred to as QEMU's ram_addr_t in other messages). The reason is
> that some unmapped memory regions, such as virtio-mem hotplugged
> memory, would still have to be transferred and could be encrypted.
> While the guest->host hypercall interface uses guest physical
> addresses to communicate which pages are encrypted, the host can do
> the GPA->ram_addr_t conversion and remember the encryption status of
> currently-unmapped regions.
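
Roughly, that bookkeeping could look like the sketch below, with the
encryption status kept in a bitmap keyed by ram_addr_t; the identity
gpa_to_ram_addr() stub is a hypothetical stand-in for QEMU's real GPA
lookup, not actual QEMU code:

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define RAM_PAGES  (1024 * 1024)            /* covers 4 GiB of ram_addr_t space */

static uint8_t enc_bitmap[RAM_PAGES / 8];   /* one bit per page: encrypted? */

/* Hypothetical stand-in: QEMU would resolve this through its RAMBlocks. */
static uint64_t gpa_to_ram_addr(uint64_t gpa)
{
    return gpa;                             /* identity, for illustration only */
}

/* Host handler for the guest's page-encryption-status hypercall. The guest
 * passes GPAs; the status is remembered per ram_addr_t page, so it survives
 * even if the region is later unmapped from guest physical address space. */
void page_enc_status_hc(uint64_t gpa, uint64_t npages, bool encrypted)
{
    uint64_t page = gpa_to_ram_addr(gpa) >> PAGE_SHIFT;

    for (uint64_t i = page; i < page + npages; i++) {
        if (encrypted) {
            enc_bitmap[i / 8] |= 1u << (i % 8);
        } else {
            enc_bitmap[i / 8] &= ~(1u << (i % 8));
        }
    }
}
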
> This poses a problem, in that the guest needs to prepare the page
> tables for the migration helper and those need to use the migration
> helper's physical address space.
>
> There are three possibilities for this:
>
> 1) the easy one: the bottom 4G of guest memory are mapped in the
> mirror VM 1:1. The ram_addr_t-based addresses are shifted by either
> 4G or a huge value such as 2^42 (MAXPHYADDR - physical address
> reduction - 1). This even lets the migration helper reuse the OVMF
> runtime services memory map (but be careful about thread safety...).
This is essentially what we do in our prototype, although we have an
even simpler approach. We have a 1:1 mapping that maps an address to
itself with the cbit set. During migration, QEMU asks the migration
handler to import/export encrypted pages and provides the GPA for said
page. Since the migration handler only exports/imports encrypted pages,
we can have the cbit set for every page in our mapping. We can still use
OVMF functions with these mappings because they are on encrypted pages.
The MH does need to use a few shared pages (to communicate with QEMU,
for instance), so we have another mapping without the cbit that is at a
large offset.
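
For illustration, the entries for those two mappings could be built along
these lines (a sketch, not the OVMF code; SHARED_MAP_OFFSET is an assumed
placeholder, while the C-bit position does come from CPUID Fn8000_001F):

#include <stdint.h>
#include <cpuid.h>

#define PTE_PRESENT       (1ULL << 0)
#define PTE_WRITABLE      (1ULL << 1)
#define SHARED_MAP_OFFSET (1ULL << 40)  /* assumed offset for the shared mapping */

/* The C-bit position is reported in CPUID Fn8000_001F, EBX bits 5:0. */
static uint64_t sev_c_bit(void)
{
    unsigned int eax, ebx, ecx, edx;
    __get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
    return 1ULL << (ebx & 0x3f);
}

/* Identity mapping for encrypted pages: VA == PA, C-bit set. */
static uint64_t encrypted_pte(uint64_t pa)
{
    return pa | sev_c_bit() | PTE_PRESENT | PTE_WRITABLE;
}

/* Shared pages: the PTE itself has the C-bit clear; the offset shows up in
 * where the entry is installed, at VA == PA + SHARED_MAP_OFFSET. */
static uint64_t shared_pte(uint64_t pa)
{
    return pa | PTE_PRESENT | PTE_WRITABLE;
}
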
I think this is basically equivalent to what you suggest. As you point
out above, this approach does require that any page that will be
exported/imported by the MH is mapped in the guest. Is this a bad
assumption? The VMSA for SEV-ES is one example of a region that is
encrypted but not mapped in the guest (the PSP handles it directly). We
have been planning to map the VMSA into the guest to support migration
with SEV-ES (along with other changes).
> 2) the more future-proof one. Here, the migration helper tells QEMU
> which area to copy from the guest to the mirror VM, as a (main GPA,
> length, mirror GPA) tuple. This could happen, for example, the first
> time the guest writes 1 to MSR_KVM_MIGRATION_CONTROL. When migration
> starts, QEMU uses this information to issue KVM_SET_USER_MEMORY_REGION
> accordingly. The page tables are built for this (usually very high)
> mirror GPA and the migration helper operates in a completely separate
> address space. However, the backing memory would still be shared
> between the main and mirror VMs. I am saying this is more future-proof
> because we have more flexibility in setting up the physical address
> space of the mirror VM.
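
On the QEMU side, handling one such tuple could look roughly like this
(a sketch; main_hva is the host virtual address already resolved from the
main VM's GPA, and slot numbering is left to the caller):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Map 'len' bytes of the main VM's backing memory (already resolved to the
 * host virtual address 'main_hva') into the mirror VM at 'mirror_gpa'.
 * Because userspace_addr points at the same backing, the memory stays
 * shared between the two VMs. */
int map_into_mirror(int mirror_vm_fd, uint32_t slot, void *main_hva,
                    uint64_t len, uint64_t mirror_gpa)
{
    struct kvm_userspace_memory_region region = {
        .slot            = slot,
        .guest_phys_addr = mirror_gpa,      /* usually a very high GPA */
        .memory_size     = len,
        .userspace_addr  = (uintptr_t)main_hva,
    };

    return ioctl(mirror_vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
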
The Migration Handler in OVMF is not a contiguous region of memory. The
MH uses OVMF helper functions that are allocated in various regions of
runtime memory. I guess I can see how separating the memory of the MH
and the guest OS could be positive. On the other hand, since the MH is
in OVMF, it is fundamentally designed to coexist with the guest OS.
What do you envision in terms of future changes to the mirror address space?
> 3) the paranoid one, which I think is what you hint at above: this is
> an extension of (2), where userspace invokes the PSP send/receive API
> to copy the small requested area of the main VM into the mirror VM.
> The mirror VM code and data are completely separate from the main VM.
> All that the mirror VM shares is the ram_addr_t data. Though I am not
> even sure it is possible to use the send/receive API this way...
Yeah not sure if you could use the PSP for this.
-Tobin
> What do you think?
>
> Paolo