On 2024-03-07 02:37 PM, Sean Christopherson wrote:
> On Thu, Mar 07, 2024, David Matlack wrote:
> > Create memslot 0 at 0x100000000 (4GiB) to avoid it overlapping with
> > KVM's private memslot for the APIC-access page.
>
> This is going to cause other problems, e.g. from max_guest_memory_test.c
>
> 	/*
> 	 * Skip the first 4gb and slot0. slot0 maps <1gb and is used to back
> 	 * the guest's code, stack, and page tables. Because selftests creates
> 	 * an IRQCHIP, a.k.a. a local APIC, KVM creates an internal memslot
> 	 * just below the 4gb boundary. This test could create memory at
> 	 * 1gb-3gb,but it's simpler to skip straight to 4gb.
> 	 */
> 	const uint64_t start_gpa = SZ_4G;
>
> Trying to move away from starting at '0' is going to be problematic/annoying,
> e.g. using low memory allows tests to safely assume 4GiB+ is always available.
> And I'd prefer not to make the infrastructure all twisty and weird for all tests
> just because memstress tests want to play with huge amounts of memory.
>
> Any chance we can solve this by using huge pages in the guest, and adjusting the
> gorilla math in vm_nr_pages_required() accordingly? There's really no reason to
> use 4KiB pages for a VM with 256GiB of memory. That'd also be more representative
> of real world workloads (at least, I hope real world workloads are using 2MiB or
> 1GiB pages in this case).

There are real world workloads that use TiB of RAM with 4KiB mappings
(looking at you SAP HANA).

What about giving tests an explicit "start" GPA they can use? That would
fix max_guest_memory_test and avoid tests making assumptions about 4GiB
being a magically safe address to use.

e.g.
Something like this on top:

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 194963e05341..584ac6fea65c 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -101,6 +101,11 @@ struct kvm_vm {
 	unsigned int page_shift;
 	unsigned int pa_bits;
 	unsigned int va_bits;
+	/*
+	 * Tests are able to use the guest physical address space from
+	 * [available_base_gpa, max_gfn << page_shift) for their own purposes.
+	 */
+	vm_paddr_t available_base_gpa;
 	uint64_t max_gfn;
 	struct list_head vcpus;
 	struct userspace_mem_regions regions;
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index c8d7e66d308d..e74d9efa82c2 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -17,6 +17,7 @@
 #include <sys/stat.h>
 #include <unistd.h>
 #include <linux/kernel.h>
+#include <linux/sizes.h>
 
 #define KVM_UTIL_MIN_PFN	2
 
@@ -414,6 +415,7 @@ struct kvm_vm *__vm_create(struct vm_shape shape, uint32_t nr_runnable_vcpus,
 	uint64_t nr_pages = vm_nr_pages_required(shape.mode, nr_runnable_vcpus,
 						 nr_extra_pages);
 	struct userspace_mem_region *slot0;
+	vm_paddr_t ucall_mmio_gpa;
 	struct kvm_vm *vm;
 	int i;
 
@@ -436,7 +438,15 @@ struct kvm_vm *__vm_create(struct vm_shape shape, uint32_t nr_runnable_vcpus,
 	 * MMIO region would prevent silently clobbering the MMIO region.
 	 */
 	slot0 = memslot2region(vm, 0);
-	ucall_init(vm, slot0->region.guest_phys_addr + slot0->region.memory_size);
+	ucall_mmio_gpa = slot0->region.guest_phys_addr + slot0->region.memory_size;
+	ucall_init(vm, ucall_mmio_gpa);
+
+	/*
+	 * 1GiB is somewhat arbitrary, but is chosen to be large enough to meet
+	 * most tests' alignment requirements/expectations.
+	 */
+	vm->available_base_gpa =
+		SZ_1G * DIV_ROUND_UP(ucall_mmio_gpa + vm->page_size, SZ_1G);
 
 	kvm_arch_vm_post_create(vm);
 
diff --git a/tools/testing/selftests/kvm/max_guest_memory_test.c b/tools/testing/selftests/kvm/max_guest_memory_test.c
index 6628dc4dda89..f5d77d2a903d 100644
--- a/tools/testing/selftests/kvm/max_guest_memory_test.c
+++ b/tools/testing/selftests/kvm/max_guest_memory_test.c
@@ -156,18 +156,10 @@ static void calc_default_nr_vcpus(void)
 
 int main(int argc, char *argv[])
 {
-	/*
-	 * Skip the first 4gb and slot0. slot0 maps <1gb and is used to back
-	 * the guest's code, stack, and page tables. Because selftests creates
-	 * an IRQCHIP, a.k.a. a local APIC, KVM creates an internal memslot
-	 * just below the 4gb boundary. This test could create memory at
-	 * 1gb-3gb,but it's simpler to skip straight to 4gb.
-	 */
-	const uint64_t start_gpa = SZ_4G;
 	const int first_slot = 1;
 
 	struct timespec time_start, time_run1, time_reset, time_run2;
-	uint64_t max_gpa, gpa, slot_size, max_mem, i;
+	uint64_t start_gpa, max_gpa, gpa, slot_size, max_mem, i;
 	int max_slots, slot, opt, fd;
 	bool hugepages = false;
 	struct kvm_vcpu **vcpus;
@@ -229,6 +221,7 @@ int main(int argc, char *argv[])
 	for (i = 0; i < slot_size; i += vm->page_size)
 		((uint8_t *)mem)[i] = 0xaa;
 
+	start_gpa = vm->available_base_gpa;
 	gpa = 0;
 	for (slot = first_slot; slot < max_slots; slot++) {
 		gpa = start_gpa + ((slot - first_slot) * slot_size);
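
To sanity-check the rounding outside the selftests tree, the arithmetic in the patch can be sketched standalone. This is only an illustration under stated assumptions: SZ_1G and DIV_ROUND_UP are redefined locally rather than taken from the kernel headers, and the helper names available_base_gpa() and slot_gpa() are hypothetical, not part of the selftests library.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel's <linux/sizes.h> and <linux/kernel.h>. */
#define SZ_1G			0x40000000ULL
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * Hypothetical helper mirroring the expression the patch adds to
 * __vm_create(): skip one page past the ucall MMIO GPA, then round up
 * to the next 1GiB boundary.
 */
static uint64_t available_base_gpa(uint64_t ucall_mmio_gpa, uint64_t page_size)
{
	return SZ_1G * DIV_ROUND_UP(ucall_mmio_gpa + page_size, SZ_1G);
}

/*
 * Hypothetical helper mirroring the per-slot GPA computation in
 * max_guest_memory_test.c once start_gpa comes from the VM.
 */
static uint64_t slot_gpa(uint64_t start_gpa, int slot, int first_slot,
			 uint64_t slot_size)
{
	assert(slot >= first_slot);
	return start_gpa + ((uint64_t)(slot - first_slot) * slot_size);
}
```

With 4KiB pages, a slot0 that ends anywhere up to 1GiB - 4KiB yields a base of exactly 1GiB; a slot0 that ends at 1GiB or later pushes the base to the next 1GiB multiple, because the expression reserves one page for the ucall MMIO address before rounding.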