On Wed, Aug 18, 2021 at 02:06:25PM +0000, Ashish Kalra wrote:
> On Wed, Aug 18, 2021 at 12:37:32AM +0200, Paolo Bonzini wrote:
> > On Tue, Aug 17, 2021 at 11:54 PM Steve Rutherford
> > <srutherford@xxxxxxxxxx> wrote:
> > > > 1) the easy one: the bottom 4G of guest memory are mapped in the
> > > > mirror VM 1:1. The ram_addr_t-based addresses are shifted by
> > > > either 4G or a huge value such as 2^42 (MAXPHYADDR - physical
> > > > address reduction - 1). This even lets the migration helper
> > > > reuse the OVMF runtime services memory map (but be careful about
> > > > thread safety...).
> > >
> > > If I understand what you are proposing, this would only work for
> > > SEV/SEV-ES, since the RMP prevents these remapping games. This
> > > makes me less enthusiastic about it (but I suspect that's why you
> > > call this less future proof).
> >
> > I called it less future proof because it allows the migration
> > helper to rely more on OVMF details, but those may not apply in the
> > future.
> >
> > However, you're right about SNP; the same page cannot be mapped
> > twice at different GPAs by a single ASID (which includes the VM and
> > the migration helper). :( That does throw a wrench in the idea of
> > mapping pages by ram_addr_t(*), and this applies to both schemes.
> >
> > Migrating RAM in PCI BARs is a mess anyway for SNP, because PCI
> > BARs can be moved, and every time they are, the migration helper
> > needs to wait for validation to happen. :(
> >
> > Paolo
> >
> > (*) ram_addr_t is not a GPA; it is constant throughout the life of
> > the guest and independent of e.g. PCI BARs. Internally, when QEMU
> > retrieves the dirty page bitmap from KVM, it stores the bits
> > indexed by ram_addr_t (shifted right by PAGE_SHIFT).
>
> With reference to SNP here, the mirror VM model seems to be a nice
> fit:
>
> SNP will support the separate address spaces for the main VM and
> mirror VMs implicitly, with the MH/MA running in VMPL0.

I need to correct this statement: there is no separate address space
as such; there is basically page-level permission/protection between
VMPL levels.

> Additionally, vTOM can be used to separate mirror VM and main VM
> memory, with private mirror VM memory below vTOM and all the shared
> stuff with the main VM set up above vTOM.

I need to take back the above statement as well: memory above vTOM is
basically decrypted (shared) memory.

Thanks,
Ashish

> The design here should probably base itself on this model, to allow
> an easy future port to SNP and also to make it more future-proof.
>
> > > > 2) the more future-proof one. Here, the migration helper tells
> > > > QEMU which area to copy from the guest to the mirror VM, as a
> > > > (main GPA, length, mirror GPA) tuple. This could happen for
> > > > example the first time the guest writes 1 to
> > > > MSR_KVM_MIGRATION_CONTROL. When migration starts, QEMU uses
> > > > this information to issue KVM_SET_USER_MEMORY_REGION
> > > > accordingly. The page tables are built for this (usually very
> > > > high) mirror GPA and the migration helper operates in a
> > > > completely separate address space. However, the backing memory
> > > > would still be shared between the main and mirror VMs. I am
> > > > saying this is more future proof because we have more
> > > > flexibility in setting up the physical address space of the
> > > > mirror VM.
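For concreteness, the QEMU side of (2) would boil down to installing a
memslot in the mirror VM that aliases the main VM's backing memory at
the (very high) mirror GPA. A minimal sketch follows; the function
name and slot number are made up, and error handling is omitted:

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Expose the guest-chosen (main GPA, length, mirror GPA) tuple to
     * the mirror VM.  main_hva is the host virtual address already
     * backing the main GPA; reusing it is what keeps the pages shared
     * between the two VMs.
     */
    static int mirror_map_tuple(int mirror_vm_fd, void *main_hva,
                                __u64 mirror_gpa, __u64 len)
    {
            struct kvm_userspace_memory_region region;

            memset(&region, 0, sizeof(region));
            region.slot            = 10;         /* any unused slot */
            region.guest_phys_addr = mirror_gpa; /* the very high GPA */
            region.memory_size     = len;
            region.userspace_addr  = (__u64)main_hva;

            return ioctl(mirror_vm_fd, KVM_SET_USER_MEMORY_REGION,
                         &region);
    }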
> > > My intuition for this leans more on the host, but matches some
> > > of the bits you've mentioned in (2)/(3). My intuition would be to
> > > put the migration helper incredibly high in GPA space, so that it
> > > does not collide with the rest of the guest (and can then stay in
> > > the same place for a fairly long period of time without needing
> > > to poke a hole in the guest). Then you can leave the
> > > ram_addr_t-based addresses mapped normally (without the
> > > offsetting). All this together allows the migration helper to be
> > > orthogonal to the normal guest and normal firmware.
> > >
> > > In this case, since the migration helper has a somewhat stable
> > > base address, you can have a prebaked entry point and page tables
> > > (determined at build time). The shared communication pages can
> > > come from neighboring high memory. The migration helper can
> > > support a straightforward halt loop (or PIO loop, or whatever)
> > > where it reads from a predefined page to find what work needs to
> > > be done (perhaps with that page depending on which CPU it is, so
> > > you can support multithreading of the migration helper).
> > > Additionally, having it high in memory makes it quite easy to
> > > assess who owns which addresses: high memory is under the purview
> > > of the migration helper and does not need to be dirty-tracked.
> > > Only "low" memory can and needs to be encrypted for transport to
> > > the target side.
> > >
> > > --Steve
> > > >
> > > > Paolo
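And a minimal sketch of the per-vCPU mailbox loop Steve describes
above, assuming the helper's prebaked page tables identity-map its
high-memory region; the mailbox layout, base address and command
values are all invented for illustration:

    #include <stdint.h>

    #define MH_BASE_GPA    (1ULL << 45)  /* helper's high-memory region */
    #define MH_CMD_IDLE    0
    #define MH_CMD_WRAP    1  /* re-encrypt a "low" page for transport */

    struct mh_mailbox {
            volatile uint64_t cmd;    /* written by the other side */
            uint64_t main_gpa;        /* which "low" page to process */
            uint64_t buf_gpa;         /* shared buffer for the result */
            volatile uint64_t status; /* written back by the helper */
    };

    /* One mailbox page per vCPU, so the helper can be multithreaded. */
    static struct mh_mailbox *mh_mailbox(unsigned int cpu)
    {
            return (struct mh_mailbox *)
                    (uintptr_t)(MH_BASE_GPA + 4096ULL * cpu);
    }

    static void mh_loop(unsigned int cpu)
    {
            struct mh_mailbox *mb = mh_mailbox(cpu);

            for (;;) {
                    while (mb->cmd == MH_CMD_IDLE)
                            /* halt loop; a PIO loop works here too */
                            __asm__ volatile("hlt");

                    if (mb->cmd == MH_CMD_WRAP) {
                            /* ...encrypt the page at main_gpa into the
                               shared buffer at buf_gpa... */
                    }
                    mb->status = 0;        /* 0 = done, made up */
                    mb->cmd = MH_CMD_IDLE; /* ready for the next one */
            }
    }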