On 8/17/21 12:32 PM, Paolo Bonzini wrote:
> On 17/08/21 01:53, Steve Rutherford wrote:
>> Separately, I'm a little wary of leaving the migration helper mapped
>> into the shared address space as writable.
> A related question here is what the API should be for how the
> migration helper sees the memory in both physical and virtual address
> spaces.
>
> First of all, I would like the addresses passed to and from the
> migration helper to *not* be guest physical addresses (this is what I
> referred to as QEMU's ram_addr_t in other messages). The reason is
> that some unmapped memory regions, such as virtio-mem hotplugged
> memory, would still have to be transferred and could be encrypted.
> While the guest->host hypercall interface uses guest physical
> addresses to communicate which pages are encrypted, the host can do
> the GPA->ram_addr_t conversion and remember the encryption status of
> currently-unmapped regions.
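
Roughly, that bookkeeping could look like the sketch below, with the
encryption status kept in a bitmap keyed by ram_addr_t; the identity
gpa_to_ram_addr() stub is a hypothetical stand-in for QEMU's real GPA
lookup, not actual QEMU code:

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define RAM_PAGES  (1024 * 1024)            /* covers 4 GiB of ram_addr_t space */

static uint8_t enc_bitmap[RAM_PAGES / 8];   /* one bit per page: encrypted? */

/* Hypothetical stand-in: QEMU would resolve this through its RAMBlocks. */
static uint64_t gpa_to_ram_addr(uint64_t gpa)
{
    return gpa;                             /* identity, for illustration only */
}

/* Host handler for the guest's page-encryption-status hypercall. The guest
 * passes GPAs; the status is remembered per ram_addr_t page, so it survives
 * even if the region is later unmapped from guest physical address space. */
void page_enc_status_hc(uint64_t gpa, uint64_t npages, bool encrypted)
{
    uint64_t page = gpa_to_ram_addr(gpa) >> PAGE_SHIFT;

    for (uint64_t i = page; i < page + npages; i++) {
        if (encrypted) {
            enc_bitmap[i / 8] |= 1u << (i % 8);
        } else {
            enc_bitmap[i / 8] &= ~(1u << (i % 8));
        }
    }
}
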
> This poses a problem, in that the guest needs to prepare the page
> tables for the migration helper and those need to use the migration
> helper's physical address space.
>
> There are three possibilities for this:
>
> 1) the easy one: the bottom 4G of guest memory are mapped in the
> mirror VM 1:1. The ram_addr_t-based addresses are shifted by either
> 4G or a huge value such as 2^42 (MAXPHYADDR - physical address
> reduction - 1). This even lets the migration helper reuse the OVMF
> runtime services memory map (but be careful about thread safety...).
This is essentially what we do in our prototype, although we have an
even simpler approach. We have a 1:1 mapping that maps an address to
itself with the cbit set. During migration, QEMU asks the migration
handler to import/export encrypted pages and provides the GPA for said
page. Since the migration handler only exports/imports encrypted pages,
we can have the cbit set for every page in our mapping. We can still use
OVMF functions with these mappings because they are on encrypted pages.
The MH does need to use a few shared pages (to communicate with QEMU,
for instance), so we have another mapping without the cbit that is at a
large offset.
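
For illustration, the entries for those two mappings could be built along
these lines (a sketch, not the OVMF code; SHARED_MAP_OFFSET is an assumed
placeholder, while the C-bit position does come from CPUID Fn8000_001F):

#include <stdint.h>
#include <cpuid.h>

#define PTE_PRESENT       (1ULL << 0)
#define PTE_WRITABLE      (1ULL << 1)
#define SHARED_MAP_OFFSET (1ULL << 40)  /* assumed offset for the shared mapping */

/* The C-bit position is reported in CPUID Fn8000_001F, EBX bits 5:0. */
static uint64_t sev_c_bit(void)
{
    unsigned int eax, ebx, ecx, edx;
    __get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
    return 1ULL << (ebx & 0x3f);
}

/* Identity mapping for encrypted pages: VA == PA, C-bit set. */
static uint64_t encrypted_pte(uint64_t pa)
{
    return pa | sev_c_bit() | PTE_PRESENT | PTE_WRITABLE;
}

/* Shared pages: the PTE itself has the C-bit clear; the offset shows up in
 * where the entry is installed, at VA == PA + SHARED_MAP_OFFSET. */
static uint64_t shared_pte(uint64_t pa)
{
    return pa | PTE_PRESENT | PTE_WRITABLE;
}
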
I think this is basically equivalent to what you suggest. As you point
out above, this approach does require that any page that will be
exported/imported by the MH is mapped in the guest. Is this a bad
assumption? The VMSA for SEV-ES is one example of a region that is
encrypted but not mapped in the guest (the PSP handles it directly). We
have been planning to map the VMSA into the guest to support migration
with SEV-ES (along with other changes).
> 2) the more future-proof one. Here, the migration helper tells QEMU
> which area to copy from the guest to the mirror VM, as a (main GPA,
> length, mirror GPA) tuple. This could happen, for example, the first
> time the guest writes 1 to MSR_KVM_MIGRATION_CONTROL. When migration
> starts, QEMU uses this information to issue KVM_SET_USER_MEMORY_REGION
> accordingly. The page tables are built for this (usually very high)
> mirror GPA and the migration helper operates in a completely separate
> address space. However, the backing memory would still be shared
> between the main and mirror VMs. I am saying this is more future-proof
> because we have more flexibility in setting up the physical address
> space of the mirror VM.
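
On the QEMU side, handling one such tuple could look roughly like this
(a sketch; main_hva is the host virtual address already resolved from the
main VM's GPA, and slot numbering is left to the caller):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Map 'len' bytes of the main VM's backing memory (already resolved to the
 * host virtual address 'main_hva') into the mirror VM at 'mirror_gpa'.
 * Because userspace_addr points at the same backing, the memory stays
 * shared between the two VMs. */
int map_into_mirror(int mirror_vm_fd, uint32_t slot, void *main_hva,
                    uint64_t len, uint64_t mirror_gpa)
{
    struct kvm_userspace_memory_region region = {
        .slot            = slot,
        .guest_phys_addr = mirror_gpa,      /* usually a very high GPA */
        .memory_size     = len,
        .userspace_addr  = (uintptr_t)main_hva,
    };

    return ioctl(mirror_vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
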
The Migration Handler in OVMF is not a contiguous region of memory. The
MH uses OVMF helper functions that are allocated in various regions of
runtime memory. I guess I can see how separating the memory of the MH
and the guest OS could be positive. On the other hand, since the MH is
in OVMF, it is fundamentally designed to coexist with the guest OS.
What do you envision in terms of future changes to the mirror address space?
> 3) the paranoid one, which I think is what you hint at above: this is
> an extension of (2), where userspace invokes the PSP send/receive API
> to copy the small requested area of the main VM into the mirror VM.
> The mirror VM code and data are completely separate from the main VM.
> All that the mirror VM shares is the ram_addr_t data. Though I am not
> even sure it is possible to use the send/receive API this way...
Yeah not sure if you could use the PSP for this.
-Tobin
> What do you think?
>
> Paolo