Hi Sean,
Thanks for implementing this POC!
I’ve started porting the selftests (both Chao’s and those I added [1]).
guest mem seems to cover the use cases that have been discussed and
proposed so far, but I still need to figure out how gmem can work with
+ hugetlbfs
+ specifying/storing memory policy (for NUMA node bindings)
+ memory accounting - we may need to account for this memory separately,
so that guest mem shows up on its own in /proc/meminfo and similar
places.
One issue I’ve found so far is that the pointer to kvm (gmem->kvm) is
never cleaned up, so it is possible to crash the host kernel in the
following way:
1. Create a KVM VM
2. Create a guest mem fd on that VM
3. Create a memslot with the guest mem fd (hence binding the fd to the
VM)
4. Close/destroy the KVM VM
5. Call fallocate(PUNCH_HOLE) on the guest mem fd, which uses gmem->kvm
when it tries to do invalidation.
I then tried to clean up the gmem->kvm pointer during unbinding when the
KVM VM is destroyed.
That works, but then I realized there’s an even simpler way to trigger
the use-after-free:
1. Create a KVM VM
2. Create a guest mem fd on that VM
3. Close/destroy the KVM VM
4. Call fallocate(PUNCH_HOLE) on the guest mem fd, which uses gmem->kvm
when it tries to do invalidation.
Perhaps binding should mean setting the gmem->kvm pointer in addition to
updating gmem->bindings, with unbinding clearing both. This would make
binding and unbinding symmetric and avoid the use-after-frees described
above.
This would also mean that creating a guest mem fd no longer depends on
the VM. Perhaps we could make creating a gmem fd a system ioctl (like
KVM_GET_API_VERSION and KVM_CREATE_VM) instead of a VM ioctl?
[1]
https://lore.kernel.org/all/cover.1678926164.git.ackerleytng@xxxxxxxxxx/T/
Ackerley