On Wed, Mar 13, 2024 at 7:31 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Mar 07, 2024, David Matlack wrote:
> > On Thu, Mar 7, 2024 at 3:27 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > >
> > > On 2024-03-07 02:37 PM, Sean Christopherson wrote:
> > > > On Thu, Mar 07, 2024, David Matlack wrote:
> > > > > Create memslot 0 at 0x100000000 (4GiB) to avoid it overlapping with
> > > > > KVM's private memslot for the APIC-access page.
> > > >
> > > > Any chance we can solve this by using huge pages in the guest, and adjusting the
> > > > gorilla math in vm_nr_pages_required() accordingly?  There's really no reason to
> > > > use 4KiB pages for a VM with 256GiB of memory.  That'd also be more representative
> > > > of real world workloads (at least, I hope real world workloads are using 2MiB or
> > > > 1GiB pages in this case).
> > >
> > > There are real world workloads that use TiB of RAM with 4KiB mappings
> > > (looking at you SAP HANA).
> > >
> > > What about giving tests an explicit "start" GPA they can use? That would
> > > fix max_guest_memory_test and avoid tests making assumptions about 4GiB
> > > being a magically safe address to use.
>
> So, rather than more hardcoded addresses and/or a knob to control _all_ code
> allocations, I think we should provide a knob to say that MEM_REGION_PT should go
> to memory above 4GiB. And to make memslot handling maintainable in the long term:
>
>  1. Add a knob to place MEM_REGION_PT at 4GiB (and as of this initial patch,
>     conditionally in their own memslot).
>
>  2. Use the PT_AT_4GIB (not the real name) knob for the various memstress tests
>     that need it.

Making tests pick when to place page tables at 4GiB seems unnecessary.
Tests that don't otherwise need a specific physical memory layout
should be able to create a VM with any amount of memory and have it
just work.

It's also not impossible that a test has 4GiB+ .bss because the guest
needs a big array for something. In that case we'd need a knob to move
MEM_REGION_CODE above 4GiB on x86_64 as well.

For x86_64 (which is the only architecture AFAIK that has a private
memslot in KVM the framework can overlap with), what's the downside of
always putting all memslots above 4GiB?

>
>  3. Formalize memslots 0..2 (CODE, DATA, and PT) as being owned by the library,
>     with memslots 3..MAX available for test usage.
>
>  4. Modify tests that assume memslots 1..MAX are available, i.e. force them to
>     start at MEM_REGION_TEST_DATA.

I think MEM_REGION_TEST_DATA is just where the framework will satisfy
test-initiated dynamic memory allocations. That's different from which
slots are free for the test to use.

But assuming I understand your intention, I agree in spirit... Tests
should be allowed to use slots TEST_SLOT..MAX and physical addresses
TEST_GPA..MAX. The framework should provide both TEST_SLOT and
TEST_GPA (names pending), and existing tests should use those instead
of random hard-coded values.

>
>  5. Use separate memslots for CODE, DATA, and PT by default.  This will allow
>     for more precise sizing of the CODE and DATA slots.

What do you mean by "[separate memslots] will allow for more precise
sizing"?

>
>  6. Shrink the number of pages for CODE to a more reasonable number.  Currently
>     vm_nr_pages_required() reserves 512 pages / 2MiB for per-VM assets, which
>     at a glance seems ridiculously excessive.
>
>  7. Use the PT_AT_4GIB knob in s390's CMMA test?  I suspect it does memslot
>     shenanigans purely so that a low gfn (4096 in the test) is guaranteed to
>     be available.

+Nico

Hm, if this test _needs_ to use GFN 4096, then maybe the framework can
give tests two regions: 0..KVM_FRAMEWORK_GPA and TEST_GPA..MAX. If the
test just needs any GFN, then it can use TEST_GPA instead of
4096 << page_shift.
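
To make the TEST_SLOT/TEST_GPA idea a bit more concrete, here is a rough,
standalone sketch. It is not selftests code: every name in it (NR_LIB_SLOTS,
struct lib_layout, align_up(), test_gpa()) is hypothetical, and it simply
assumes the library's CODE, DATA, and PT memory is laid out contiguously from
some base GPA (e.g. 4GiB), with tests getting everything after that.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical names throughout; nothing here exists in the KVM
 * selftests library. TEST_SLOT and the test GPA are "names pending".
 */
#define SZ_4G		(1ULL << 32)
#define NR_LIB_SLOTS	3u		/* CODE, DATA, PT owned by the library */
#define TEST_SLOT	NR_LIB_SLOTS	/* first memslot a test may use */

/* Assumed description of the memory the library claims for itself. */
struct lib_layout {
	uint64_t base_gpa;	/* where the library's memslots start */
	uint64_t nr_pages;	/* pages consumed by CODE + DATA + PT */
	uint64_t page_size;	/* guest page size in bytes */
};

/* Round @x up to a multiple of @align, which must be a power of two. */
static inline uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align && !(align & (align - 1)));
	return (x + align - 1) & ~(align - 1);
}

/* First GPA a test may use: end of library memory, rounded up to @align. */
static inline uint64_t test_gpa(const struct lib_layout *l, uint64_t align)
{
	return align_up(l->base_gpa + l->nr_pages * l->page_size, align);
}
```

With something along these lines, a test would add its memslots starting at
TEST_SLOT and place them at or above test_gpa() rather than hard-coding a
slot number or an address such as 4096 << page_shift.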