Re: [RFC PATCH v2 00/21] QEMU gmem implemention

Xiaoyao Li <xiaoyao.li@xxxxxxxxx> · Fri, 15 Sep 2023 11:37:26 +0800

On 9/14/2023 9:09 PM, David Hildenbrand wrote:
On 14.09.23 05:50, Xiaoyao Li wrote:
It's the v2 RFC of enabling KVM gmem[1] as the backend for private
memory.

For confidential-computing, KVM provides gmem/guest_mem interfaces for
userspace, like QEMU, to allocate user-inaccessible private memory. This
series aims to add gmem support in QEMU's RAMBlock so that each RAM can
have both hva-based shared memory and gmem_fd based private memory. QEMU
does the shared-private conversion on KVM_MEMORY_EXIT and discards the
memory.

It chooses the design that adds a "private" property to the hostmemory
backend. If the "private" property is set, QEMU will allocate/create KVM
gmem when initializing the RAMBlock of the memory backend.

This series also introduces the first user of kvm gmem,
KVM_X86_SW_PROTECTED_VM. A KVM_X86_SW_PROTECTED_VM with private KVM gmem
can be created with

   $qemu -object sw-protected-vm,id=sp-vm0 \
    -object memory-backend-ram,id=mem0,size=1G,private=on \
    -machine q35,kernel_irqchip=split,confidential-guest-support=sp-vm0,memory-backend=mem0 \
    ...

Unfortunately, with this patch series OVMF fails to boot at a very early
stage due to a triple fault, because KVM doesn't support emulating
string I/O to private memory.

Is support being added? Or have we figured out what it would take to 
make it work?

Hi David,

I'm only replying to the questions that weren't covered by Sean's reply.

How does this interact with other features (memory ballooning, virtiofs, 
vfio/mdev/...)?

I need time to learn about them before I can answer.

This version still leaves some open issues to be discussed:
1. whether the "private" property needs to be user-settable?

    It seems unnecessary because the vm-type is already determined. If
    the VM is a confidential guest, then the RAM of the guest must be
    able to be mapped as private, i.e., have a kvm gmem backend. So QEMU
    can determine the value of the "private" property automatically
    based on the vm type.

    This also aligns with the board internal MemoryRegion that needs to
    have kvm gmem backend, e.g., TDX requires OVMF to act as private
    memory so bios memory region needs to have kvm gmem fd associated.
    QEMU no doubt will do it internally automatically.

Would it make sense to have some regions without "private" semantics? 
Like NVDIMMs?

Of course it can have regions without "private" semantics.

Whether a region needs a "private" backend depends on the definition of
the VM type. E.g., for TDX:
 - all the RAM needs to be able to be mapped as private, so it needs
   private gmem;
 - TDVF (OVMF) code must be mapped as private, so it needs private gmem;
 - the MMIO region needs to be shared for TDX 1.0, so it doesn't need
   private gmem.
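
To make the TDX rules above concrete, here is a minimal sketch of the
per-region decision. It is not code from the series: every identifier
with a sketch_/SKETCH_ prefix is hypothetical, and only the two cases
discussed in this thread (a non-confidential VM and TDX) are modelled.

```c
#include <stdbool.h>

/* Hypothetical VM types; only the cases discussed in the thread. */
enum sketch_vm_type {
    SKETCH_VM_NON_CONFIDENTIAL,
    SKETCH_VM_TDX,
};

/* Hypothetical region kinds matching the three bullets in the thread. */
enum sketch_region_kind {
    SKETCH_REGION_RAM,       /* guest RAM */
    SKETCH_REGION_FIRMWARE,  /* TDVF (OVMF) code */
    SKETCH_REGION_MMIO,      /* MMIO, shared for TDX 1.0 */
};

/* Returns true if the region needs a private gmem backend. */
static bool sketch_region_needs_private_gmem(enum sketch_vm_type vm,
                                             enum sketch_region_kind kind)
{
    if (vm != SKETCH_VM_TDX) {
        /* A non-confidential VM has no private memory at all. */
        return false;
    }

    switch (kind) {
    case SKETCH_REGION_RAM:
    case SKETCH_REGION_FIRMWARE:
        /* Must be mappable as private, so gmem is required. */
        return true;
    case SKETCH_REGION_MMIO:
        /* Stays shared for TDX 1.0, so no gmem is needed. */
        return false;
    }
    return false;
}
```

With a helper of this shape, QEMU could derive the "private" setting from
the vm type instead of asking the user, which is the direction open
issue 1 leans towards.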

2. hugepage support.

    KVM gmem can be allocated from hugetlbfs. How does QEMU determine
    when to allocate KVM gmem with KVM_GUEST_MEMFD_ALLOW_HUGEPAGE? The
    easiest solution is to create KVM gmem with
    KVM_GUEST_MEMFD_ALLOW_HUGEPAGE only when the memory backend is a
    HostMemoryBackendFile of hugetlbfs.

Good question.

Probably "if the memory backend uses huge pages, also use huge pages for 
the private gmem" makes sense.

... but it becomes a mess with preallocation ... which is what people 
should actually be using with hugetlb. And eventual double 
memory-consumption ... but maybe that's all been taken care of already?

Probably it's best to leave hugetlb support as future work and start 
with something minimal.

As Sean replied, I had misunderstood 
KVM_GUEST_MEMFD_ALLOW_HUGEPAGE. If it's for THP, I think we can allow it 
for every gmem.

As for hugetlb, we can leave it as future work.
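
The two options for the hugepage flag discussed above can be compared in
a small sketch. It is only an illustration under stated assumptions:
SKETCH_GMEM_ALLOW_HUGEPAGE is a placeholder standing in for
KVM_GUEST_MEMFD_ALLOW_HUGEPAGE (the real value comes from the kernel
uapi header of the gmem series), and the policy enum and helper are
hypothetical names, not part of QEMU or KVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for KVM_GUEST_MEMFD_ALLOW_HUGEPAGE; value is illustrative. */
#define SKETCH_GMEM_ALLOW_HUGEPAGE (UINT64_C(1) << 0)

/* Hypothetical policies corresponding to the two proposals in the thread. */
enum sketch_gmem_policy {
    /* Cover letter: set the flag only for a hugetlbfs file backend. */
    SKETCH_POLICY_HUGETLBFS_ONLY,
    /* If the flag is about THP: allow it for every gmem. */
    SKETCH_POLICY_ALWAYS,
};

/* Computes the flags QEMU would pass when creating the gmem fd. */
static uint64_t sketch_gmem_flags(enum sketch_gmem_policy policy,
                                  bool backend_is_hugetlbfs)
{
    switch (policy) {
    case SKETCH_POLICY_HUGETLBFS_ONLY:
        return backend_is_hugetlbfs ? SKETCH_GMEM_ALLOW_HUGEPAGE : 0;
    case SKETCH_POLICY_ALWAYS:
        return SKETCH_GMEM_ALLOW_HUGEPAGE;
    }
    return 0;
}
```

Under the second policy the backend type no longer matters, which is why
hugetlb handling can be deferred without blocking the rest of the series.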