From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>

The objective of this RFC patch series is to develop a uAPI for
(pre)populating guest memory across various use cases and underlying VM
technologies.

- Pre-populate guest memory to mitigate excessive KVM page faults during
  guest boot [1], a need not limited to any specific technology.

- Pre-populate guest memory (including encryption and measurement) for
  confidential guests [2]: SEV-SNP, TDX, and SW-PROTECTED VM, and
  potentially other technologies and pKVM.

The patches are organized as follows.
- 1: documentation on uAPI KVM_MAP_MEMORY.
- 2: architecture-independent implementation part.
- 3-4: refactoring of x86 KVM MMU as preparation.
- 5: x86 helper function to map guest page.
- 6: x86 KVM arch implementation.
- 7: Add x86-ops necessary for TDX and SEV-SNP.
- 8: selftest for validation.

Discussion point:

uAPI design:
- access flags
  Access flags are needed for the guest memory population.  We have options
  for their exposure to uAPI.
  - option 1. Introduce access flags, possibly with the addition of a
    private access flag.
  - option 2. Omit access flags from uAPI.
    Allow the kernel to deduce the necessary flag based on the memory slot
    and its memory attributes.

- SEV-SNP and byte vs. page size
  The SEV counterpart is SEV_LAUNCH_UPDATE_DATA, which requires memory
  regions to be 16-byte aligned rather than page aligned.  Should we define
  struct kvm_memory_mapping in bytes rather than page size?

    struct kvm_sev_launch_update_data {
            __u64 uaddr;
            __u32 len;
    };

- TDX and measurement
  The TDX counterpart is TDH.MEM.PAGE.ADD and TDH.MR.EXTEND.
  TDH.MR.EXTEND extends its measurement by the page contents.
  - option 1. Add an additional flag like KVM_MEMORY_MAPPING_FLAG_EXTEND to
    issue TDH.MR.EXTEND.
  - option 2. Don't handle extend.  Let the TDX vendor-specific API
    KVM_MEMORY_ENCRYPT_OP handle it with a subcommand like
    KVM_TDX_EXTEND_MEMORY.

- TDX and struct kvm_memory_mapping::source
  While the current patch series doesn't utilize the
  kvm_memory_mapping::source member, TDX needs it to specify the source of
  memory contents.

Implementation:
- x86 KVM MMU
  In x86 KVM MMU, I chose to use kvm_mmu_do_page_fault().  It's not confined
  to KVM TDP MMU.  We can restrict it to KVM TDP MMU and introduce an
  optimized version.

[1] https://lore.kernel.org/all/65262e67-7885-971a-896d-ad9c0a760907@xxxxxxxxx/
[2] https://lore.kernel.org/all/6a4c029af70d41b63bcee3d6a1f0c2377f6eb4bd.1690322424.git.isaku.yamahata@xxxxxxxxx

Thanks,

Isaku Yamahata (8):
  KVM: Document KVM_MAP_MEMORY ioctl
  KVM: Add KVM_MAP_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: x86/mmu: Introduce initialier macro for struct kvm_page_fault
  KVM: x86/mmu: Factor out kvm_mmu_do_page_fault()
  KVM: x86/mmu: Introduce kvm_mmu_map_page() for prepopulating guest memory
  KVM: x86: Implement kvm_arch_{, pre_}vcpu_map_memory()
  KVM: x86: Add hooks in kvm_arch_vcpu_map_memory()
  KVM: selftests: x86: Add test for KVM_MAP_MEMORY

 Documentation/virt/kvm/api.rst                |  36 +++++
 arch/x86/include/asm/kvm-x86-ops.h            |   2 +
 arch/x86/include/asm/kvm_host.h               |   6 +
 arch/x86/kvm/mmu.h                            |   3 +
 arch/x86/kvm/mmu/mmu.c                        |  30 ++++
 arch/x86/kvm/mmu/mmu_internal.h               |  70 +++++----
 arch/x86/kvm/x86.c                            |  83 +++++++++++
 include/linux/kvm_host.h                      |   4 +
 include/uapi/linux/kvm.h                      |  15 ++
 tools/include/uapi/linux/kvm.h                |  14 ++
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/map_memory_test.c    | 136 ++++++++++++++++++
 virt/kvm/kvm_main.c                           |  74 ++++++++++
 13 files changed, 448 insertions(+), 26 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/map_memory_test.c


base-commit: 6a108bdc49138bcaa4f995ed87681ab9c65122ad
-- 
2.25.1