On 2021/2/11 7:06, Sean Christopherson wrote:
Align the HVA for hugepage memslots to 1gb, as opposed to incorrectly
assuming all architectures' hugepages are 512*page_size.
For x86, multiplying by 512 is correct, but only for 2mb pages, i.e.
systems that support 1gb pages will never be able to use them for mapping
guest memory, and thus those flows will not be exercised.
For arm64, powerpc, and s390 (and mips?), hardcoding the multiplier to
512 is either flat out wrong, or at best correct only in certain
configurations.
Hardcoding the _alignment_ to 1gb is a compromise between correctness and
simplicity. Due to the myriad flavors of hugepages across architectures,
attempting to enumerate the exact hugepage size is difficult, and likely
requires probing the kernel.
But, there is no need for precision since a stronger alignment will not
prevent creating a smaller hugepage. For all but the most extreme cases,
e.g. arm64's 16gb contiguous PMDs, aligning to 1gb is sufficient to allow
KVM to back the guest with hugepages.
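To illustrate why over-aligning is harmless: every address aligned to 1gb is
also aligned to any smaller power-of-two hugepage size (1mb, 2mb, 512mb). A
minimal sketch, where align_up() and is_aligned() are hypothetical helpers for
illustration and not part of the selftests library:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the value proposed in the patch. */
#define KVM_UTIL_HUGEPAGE_ALIGNMENT (1ULL << 30)

/* Hypothetical helper: round addr up to a multiple of align (power of two). */
static inline uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical helper: true if addr is a multiple of align (power of two). */
static inline bool is_aligned(uint64_t addr, uint64_t align)
{
	return !(addr & (align - 1));
}
```

For example, align_up(0x12345678, KVM_UTIL_HUGEPAGE_ALIGNMENT) yields
0x40000000, which is_aligned() reports as aligned for 1mb, 2mb, and 512mb
as well. The same rounding could be applied to a GPA before passing it to
vm_userspace_mem_region_add().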
I have implemented a helper, get_backing_src_pagesz(), that returns the
granularity of each backing src type (anonymous/thp/hugetlb) and works
across different architectures.
See:
https://lore.kernel.org/lkml/20210225055940.18748-6-wangyanan55@xxxxxxxxxx/
If it looks fine to you, maybe we can use the accurate page sizes for
GPA/HVA alignment :).
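
As a rough idea of what probing the kernel can look like, here is a minimal
sketch that reads the default hugetlb page size from the "Hugepagesize:" line
of /proc/meminfo. This is not the get_backing_src_pagesz() implementation from
the linked patch; both function names below are hypothetical and exist only
for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: scan a meminfo-style stream for the "Hugepagesize:"
 * line and return the size in bytes, or 0 if the line is not present.
 */
static size_t parse_hugetlb_default_size(FILE *f)
{
	char line[128];
	unsigned long kb;

	while (fgets(line, sizeof(line), f)) {
		/* Whitespace in the format matches any run of blanks. */
		if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
			return (size_t)kb * 1024;
	}
	return 0;
}

/* Hypothetical helper: default hugetlb page size in bytes, 0 on failure. */
static size_t probe_hugetlb_default_size(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	size_t size;

	if (!f)
		return 0;
	size = parse_hugetlb_default_size(f);
	fclose(f);
	return size;
}
```

A caller would treat a return value of 0 as "unknown" and could fall back to a
conservative alignment. A THP variant could presumably read
/sys/kernel/mm/transparent_hugepage/hpage_pmd_size in a similar way.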
Thanks,
Yanan
Add the new alignment in kvm_util.h so that it can be used by callers of
vm_userspace_mem_region_add(), e.g. to also ensure GPAs are aligned.
Cc: Ben Gardon <bgardon@xxxxxxxxxx>
Cc: Yanan Wang <wangyanan55@xxxxxxxxxx>
Cc: Andrew Jones <drjones@xxxxxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: Aaron Lewis <aaronlewis@xxxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
tools/testing/selftests/kvm/include/kvm_util.h | 13 +++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 4 +---
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 4b5d2362a68a..a7dbdf46aa51 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -68,6 +68,19 @@ enum vm_guest_mode {
#define MIN_PAGE_SIZE (1U << MIN_PAGE_SHIFT)
#define PTES_PER_MIN_PAGE ptes_per_page(MIN_PAGE_SIZE)
+/*
+ * KVM_UTIL_HUGEPAGE_ALIGNMENT is selftest's required alignment for both host
+ * and guest addresses when backing guest memory with hugepages. This is not
+ * the exact size of hugepages, rather it's a size that should allow backing
+ * the guest with hugepages on all architectures. Precisely tracking the exact
+ * sizes across all architectures is more pain than gain, e.g. x86 supports 2mb
+ * and 1gb hugepages, arm64 supports 2mb and 1gb hugepages when using 4kb pages
+ * and 512mb hugepages when using 64kb pages (ignoring contiguous TLB entries),
+ * powerpc radix supports 1gb hugepages when using 64kb pages, s390 supports 1mb
+ * hugepages, and so on and so forth.
+ */
+#define KVM_UTIL_HUGEPAGE_ALIGNMENT (1ULL << 30)
+
#define vm_guest_mode_string(m) vm_guest_mode_string[m]
extern const char * const vm_guest_mode_string[];
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index deaeb47b5a6d..2e497fbab6ae 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -18,7 +18,6 @@
#include <unistd.h>
#include <linux/kernel.h>
-#define KVM_UTIL_PGS_PER_HUGEPG 512
#define KVM_UTIL_MIN_PFN 2
/*
@@ -670,7 +669,6 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
{
int ret;
struct userspace_mem_region *region;
- size_t huge_page_size = KVM_UTIL_PGS_PER_HUGEPG * vm->page_size;
size_t alignment;
TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages,
@@ -733,7 +731,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
if (src_type == VM_MEM_SRC_ANONYMOUS_THP ||
src_type == VM_MEM_SRC_ANONYMOUS_HUGETLB)
- alignment = max(huge_page_size, alignment);
+ alignment = max((size_t)KVM_UTIL_HUGEPAGE_ALIGNMENT, alignment);
else
ASSERT_EQ(src_type, VM_MEM_SRC_ANONYMOUS);