[RFC PATCH v2 00/10] KVM: Restricted mapping of guest_memfd at the host and pKVM/arm64 support

This series adds restricted mmap() support to guest_memfd, as
well as support for guest_memfd on pKVM/arm64. It is based on
Linux 6.10.

Main changes since V1 [1]:

- Decoupled whether guest memory is mappable from KVM memory
attributes (SeanC)

Mappability is now tracked in the guest_memfd object, orthogonal to
whether userspace wants the memory to be private or shared.
Moreover, the memory attributes capability (i.e.,
KVM_CAP_MEMORY_ATTRIBUTES) is not enabled for pKVM, since for
software-based hypervisors such as pKVM and Gunyah, userspace is
informed of the state of the memory via hypervisor exits if
needed.

Even if attributes are enabled, this patch series would still
work (modulo bugs), without compromising guest memory or
crashing the system.

- Use page_mapped() instead of page_mapcount() to check if page
is mapped (DavidH)

- Add a new capability, KVM_CAP_GUEST_MEMFD_MAPPABLE, to query
whether guest private memory can be mapped (with the
aforementioned restrictions)

- Add a selftest to check whether memory is mappable when the
capability is enabled, and not mappable otherwise. Also, test the
effect of punching holes in mapped memory. (DavidH)

By design, guest_memfd cannot be mapped, read, or written by the
host. In pKVM, memory shared between a protected guest and the
host is shared in-place, unlike the other confidential computing
solutions that guest_memfd was originally envisaged for (e.g.,
TDX). When initializing a guest, as well as when accessing memory
shared by the guest with the host, it would be useful to support
mapping that memory at the host to avoid copying its contents.

One of the benefits of guest_memfd is that it prevents a
misbehaving host process from crashing the system when attempting
to access (deliberately or accidentally) protected guest memory,
since this memory isn't mapped to begin with. Without
guest_memfd, the hypervisor would still prevent such accesses,
but in certain cases the host kernel wouldn't be able to recover,
causing the system to crash.

Support for mmap() in this patch series maintains the invariant
that only memory shared with the host, either explicitly by the
guest or implicitly before the guest has started running (in
order to populate its memory), is allowed to have a valid mapping
at the host. At no time should private (as viewed by the
hypervisor) guest memory be mapped at the host.

This patch series is divided into two parts:

The first part applies to the KVM core code. It adds opt-in support
for mapping guest memory only as long as it is shared. For that,
the host needs to know the mappability status of guest memory.
Therefore, the series adds a structure to track whether memory is
mappable. This new structure is associated with each guest_memfd
object.
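
To make the idea of per-file mappability tracking concrete, here is
a minimal userspace sketch. It is only a toy model and not the
structure the patches add: the names gmem_model, MODEL_NR_PAGES and
the helpers are hypothetical, the all-mappable initial state stands
in for the pre-boot population phase, and the error codes are
assumptions. The host_mapped[] array plays the role that
page_mapped() plays in the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 8 /* hypothetical size of the toy file, in pages */

/* Hypothetical model of per-guest_memfd tracking; not the series' structure. */
struct gmem_model {
	bool mappable[MODEL_NR_PAGES];    /* host is allowed to map this page */
	bool host_mapped[MODEL_NR_PAGES]; /* stand-in for page_mapped() */
};

/* Assumption: every page starts mappable so the host can populate memory. */
static void gmem_model_init(struct gmem_model *g)
{
	for (size_t i = 0; i < MODEL_NR_PAGES; i++) {
		g->mappable[i] = true;
		g->host_mapped[i] = false;
	}
}

/* Record that the host established (true) or tore down (false) a mapping. */
static int gmem_model_set_host_mapped(struct gmem_model *g, size_t idx,
				      bool mapped)
{
	if (idx >= MODEL_NR_PAGES)
		return -EINVAL;
	if (mapped && !g->mappable[idx])
		return -EPERM; /* private memory must never be mapped */
	g->host_mapped[idx] = mapped;
	return 0;
}

/* Change mappability; refuse to make a page private while it is mapped. */
static int gmem_model_set_mappable(struct gmem_model *g, size_t idx,
				   bool mappable)
{
	if (idx >= MODEL_NR_PAGES)
		return -EINVAL;
	if (!mappable && g->host_mapped[idx])
		return -EBUSY; /* assumed error code, for illustration only */
	g->mappable[idx] = mappable;
	return 0;
}
```

In this model, a page that the host still has mapped cannot become
private until the mapping is gone, and a private page cannot gain a
new mapping, which is the invariant described above.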

The second part of the series adds guest_memfd support for
pKVM/arm64.

We don't enforce the invariant that only memory shared with the
host can be mapped by the host userspace in
file_operations::mmap(), but we enforce it in
vm_operations_struct::fault(). On vm_operations_struct::fault(),
we check whether the page is allowed to be mapped. If not, we
deliver a SIGBUS to the current task, as discussed in the Linux
MM Alignment Session on this topic [2].
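
The split between mmap() time and fault time can be illustrated with
a small userspace model. This is a sketch under stated assumptions,
not the code in the series: model_file, model_mmap(), model_fault()
and the result enum are hypothetical names, and the range check in
model_mmap() is an assumption of the model. The point it shows is
that mmap() never consults the mappability state, while every fault
does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a fault in the model. */
enum model_fault_result {
	MODEL_FAULT_MAPPED = 0, /* page installed in the host mapping */
	MODEL_FAULT_SIGBUS = 1, /* faulting task would receive SIGBUS */
};

/* Hypothetical stand-in for a guest_memfd file and its tracking state. */
struct model_file {
	size_t nr_pages;
	const bool *mappable; /* one entry per page; true means shared */
};

/*
 * mmap() time: only validate the requested range (a model assumption).
 * Mappability is deliberately not checked here.
 */
static int model_mmap(const struct model_file *f, size_t pgoff, size_t nr)
{
	if (nr == 0 || pgoff > f->nr_pages || nr > f->nr_pages - pgoff)
		return -1;
	return 0;
}

/* Fault time: map the page only if it is currently allowed to be mapped. */
static enum model_fault_result model_fault(const struct model_file *f,
					   size_t pgoff)
{
	if (pgoff >= f->nr_pages || !f->mappable[pgoff])
		return MODEL_FAULT_SIGBUS;
	return MODEL_FAULT_MAPPED;
}
```

With a four-page file whose second page is private, model_mmap() of
the whole range succeeds, model_fault() on the first page maps it,
and model_fault() on the second page reports the SIGBUS outcome.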

Currently there's no support for huge pages. We hope to add it in
the future, and it seems to be a hot topic for the upcoming
LPC 2024 [3].

Cheers,
/fuad

[1] https://lore.kernel.org/all/20240222161047.402609-1-tabba@xxxxxxxxxx/

[2] https://lore.kernel.org/all/20240712232937.2861788-1-ackerleytng@xxxxxxxxxx/

[3] https://lpc.events/event/18/sessions/183/#20240919

Fuad Tabba (10):
  KVM: Introduce kvm_gmem_get_pfn_locked(), which retains the folio lock
  KVM: Add restricted support for mapping guestmem by the host
  KVM: Implement kvm_(read|/write)_guest_page for private memory slots
  KVM: Add KVM capability to check if guest_memfd can be mapped by the
    host
  KVM: selftests: guest_memfd mmap() test when mapping is allowed
  KVM: arm64: Skip VMA checks for slots without userspace address
  KVM: arm64: Do not allow changes to private memory slots
  KVM: arm64: Handle guest_memfd()-backed guest page faults
  KVM: arm64: arm64 has private memory support when config is enabled
  KVM: arm64: Enable private memory kconfig for arm64

 arch/arm64/include/asm/kvm_host.h             |   3 +
 arch/arm64/kvm/Kconfig                        |   1 +
 arch/arm64/kvm/mmu.c                          | 139 +++++++++-
 include/linux/kvm_host.h                      |  72 +++++
 include/uapi/linux/kvm.h                      |   3 +-
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../testing/selftests/kvm/guest_memfd_test.c  |  47 +++-
 virt/kvm/Kconfig                              |   4 +
 virt/kvm/guest_memfd.c                        | 129 ++++++++-
 virt/kvm/kvm_main.c                           | 253 ++++++++++++++++--
 10 files changed, 628 insertions(+), 24 deletions(-)


base-commit: 0c3836482481200ead7b416ca80c68a29cfdaabd
-- 
2.46.0.rc1.232.g9752f9e123-goog




