On Thu, Jul 21, 2022 at 01:29:34AM +0000, Sean Christopherson wrote:
> On Fri, Jun 24, 2022, Ricardo Koller wrote:
> > Add a new test for stage 2 faults when using different combinations of
> > guest accesses (e.g., write, S1PTW), backing source type (e.g., anon),
> > and types of faults (e.g., read on hugetlbfs with a hole). The next
> > commits will add different handling methods and more faults (e.g., uffd
> > and dirty logging). This first commit starts by adding two sanity checks
> > for all types of accesses: AF setting by the hw, and accessing memslots
> > with holes.
> >
> > Signed-off-by: Ricardo Koller <ricarkol@xxxxxxxxxx>
> > ---
>
> ...
>
> > +/*
> > + * Create a memslot for test data (memslot[TEST]) and another one for PT
> > + * tables (memslot[PT]). This diagram shows the resulting guest virtual
> > + * and physical address space when using 4K backing pages for the
> > + * memslots, and 4K guest pages.
> > + *
> > + *                      Guest physical            Guest virtual
> > + *
> > + *                    |              |          |             |
> > + *                    |              |          |             |
> > + *                    +--------------+          +-------------+
> > + * max_gfn - 0x1000   | TEST memslot |<---------+ test data   | 0xc0000000
> > + *                    +--------------+          +-------------+
> > + * max_gfn - 0x2000   | gap          |<---------+ gap         | 0xbffff000
> > + *                    +--------------+          +-------------+
> > + *                    |              |          |             |
> > + *                    |              |          |             |
> > + *                    | PT memslot   |          |             |
> > + *                    |              |          +-------------+
> > + * max_gfn - 0x6000   |              |<----+    |             |
> > + *                    +--------------+     |    |             |
> > + *                    |              |     |    | PTE for the |
> > + *                    |              |     |    | test data   |
> > + *                    |              |     +----+ page        | 0xb0000000
> > + *                    |              |          +-------------+
> > + *                    |              |          |             |
> > + *                    |              |          |             |
> > + *
> > + * Using different guest page sizes or backing pages will result in the
> > + * same layout, but at different addresses. In particular, the memslot
> > + * sizes need to be a multiple of the backing page size (e.g., 2MB).
> > + */
> > +static void setup_memslots(struct kvm_vm *vm, enum vm_guest_mode mode,
> > +                           struct test_params *p)
> > +{
> > +        uint64_t backing_page_size = get_backing_src_pagesz(p->src_type);
> > +        uint64_t guest_page_size = vm_guest_mode_params[mode].page_size;
> > +        struct test_desc *test = p->test_desc;
> > +        uint64_t gap_gpa;
> > +        uint64_t alignment;
> > +        int i;
> > +
> > +        memslot[TEST].size = align_up(guest_page_size, backing_page_size);
> > +        /*
> > +         * We need one guest page for the PT table containing the PTE (for
> > +         * TEST_GVA), but might need more in case the higher level PT tables
> > +         * were not allocated yet.
> > +         */
> > +        memslot[PT].size = align_up(4 * guest_page_size, backing_page_size);
> > +
> > +        for (i = 0; i < NR_MEMSLOTS; i++) {
> > +                memslot[i].guest_pages = memslot[i].size / guest_page_size;
> > +                memslot[i].src_type = p->src_type;
> > +        }
> > +
> > +        /* Place the memslots GPAs at the end of physical memory */
> > +        alignment = max(backing_page_size, guest_page_size);
> > +        memslot[TEST].gpa = (vm->max_gfn - memslot[TEST].guest_pages) *
> > +                            guest_page_size;
> > +        memslot[TEST].gpa = align_down(memslot[TEST].gpa, alignment);
> > +
> > +        /* Add a 1-guest_page gap between the two memslots */
> > +        gap_gpa = memslot[TEST].gpa - guest_page_size;
> > +        /* Map the gap so it's still addressable from the guest. */
> > +        virt_pg_map(vm, TEST_GVA - guest_page_size, gap_gpa);
> > +
> > +        memslot[PT].gpa = gap_gpa - memslot[PT].size;
> > +        memslot[PT].gpa = align_down(memslot[PT].gpa, alignment);
> > +
> > +        vm_userspace_mem_region_add(vm, p->src_type, memslot[PT].gpa,
> > +                                    memslot[PT].idx, memslot[PT].guest_pages,
> > +                                    test->pt_memslot_flags);
> > +        vm_userspace_mem_region_add(vm, p->src_type, memslot[TEST].gpa,
> > +                                    memslot[TEST].idx, memslot[TEST].guest_pages,
> > +                                    test->test_memslot_flags);
> > +
> > +        for (i = 0; i < NR_MEMSLOTS; i++)
> > +                memslot[i].hva = addr_gpa2hva(vm, memslot[i].gpa);
> > +
> > +        /* Map the test TEST_GVA using the PT memslot. */
> > +        _virt_pg_map(vm, TEST_GVA, memslot[TEST].gpa, MT_NORMAL,
> > +                     TEST_PT_SLOT_INDEX);
>
> Use memslot[TEST].idx instead of TEST_PT_SLOT_INDEX to be consistent, though
> my preference would be to avoid this API.
>
> IIUC, the goal is to test different backing stores for the memory the guest
> uses for its page tables. But do we care about testing that a single guest's
> page tables are backed with different types concurrently?

This test creates a new VM for each subtest, so an API like that would
definitely make this code simpler/nicer.

> If we don't care, and maybe even if we do, then my preference would be to
> enhance the __vm_create family of helpers to allow for specifying what
> backing type should be used for page tables, i.e. associate the info with
> the VM instead of passing it around the stack.
>
> One idea would be to do something like David Matlack suggested a while back
> and replace extra_mem_pages with a struct, e.g. struct kvm_vm_mem_params.
> That struct can then provide the necessary knobs to control how memory is
> allocated. And then the lib can provide a global
>
> 	struct kvm_vm_mem_params kvm_default_vm_mem_params;

I like this idea, passing the info at vm creation. What about dividing the
changes in two?

1. I will add the struct to __vm_create() as part of this series, and then
   use it in this commit. There's only one user, dirty_log_test.c:

	vm = __vm_create(mode, 1, extra_mem_pages);

   so that would avoid having to touch every test as part of this patchset.

2. I can then send another series to add support for all the other
   vm_create() functions.

Alternatively, I can send a new series that does both 1 and 2 afterwards.
WDYT?
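To make (1) concrete, here is a rough sketch of what I have in mind (the
names and fields are tentative, just the minimum this test would need):

	/*
	 * Tentative sketch, not final API: a struct replacing the
	 * extra_mem_pages argument of __vm_create(), so the backing source
	 * used for guest page tables is associated with the VM at creation
	 * time instead of being passed around the stack.
	 */
	struct kvm_vm_mem_params {
		uint64_t nr_extra_pages;		/* was extra_mem_pages */
		enum vm_mem_backing_src_type src_type;	/* backing for guest PTs */
	};

	/* Default for tests that don't care; preserves today's behavior. */
	struct kvm_vm_mem_params kvm_default_vm_mem_params = {
		.nr_extra_pages = 0,
		.src_type = VM_MEM_SRC_ANONYMOUS,
	};

The dirty_log_test call site above would then become something like:

	struct kvm_vm_mem_params mem_params = kvm_default_vm_mem_params;

	mem_params.nr_extra_pages = extra_mem_pages;
	vm = __vm_create(mode, 1, &mem_params);

and this test would just set mem_params.src_type = p->src_type before
creating the VM.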
> or whatever (probably a shorter name) for the tests that don't care. And
> then down the road, if someone wants to control the backing type for code
> vs. data, we could add those flags to kvm_vm_mem_params and add
> vm_vaddr_alloc() wrappers to alloc code vs. data (with a default to data?).
>
> I don't like the param approach because it bleeds implementation details
> that really shouldn't matter, e.g. which memslot is the default, into
> tests. And it's not very easy to use, e.g. if a different test wants to do
> something similar, it would have to create its own memslot, populate the
> tables, etc...
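For the code vs. data idea, just to check that I'm reading the suggestion
correctly: I imagine the wrappers would look roughly like this (hypothetical
names, sketched on top of the existing vm_vaddr_alloc()):

	/* Hypothetical wrapper: data stays the default allocation type. */
	static vm_vaddr_t vm_vaddr_alloc_data(struct kvm_vm *vm, size_t sz,
					      vm_vaddr_t vaddr_min)
	{
		return vm_vaddr_alloc(vm, sz, vaddr_min);
	}

	/*
	 * Hypothetical wrapper: would eventually allocate from whatever
	 * backing type kvm_vm_mem_params specifies for code; until that
	 * flag exists it behaves exactly like a data allocation.
	 */
	static vm_vaddr_t vm_vaddr_alloc_code(struct kvm_vm *vm, size_t sz,
					      vm_vaddr_t vaddr_min)
	{
		return vm_vaddr_alloc(vm, sz, vaddr_min);
	}

That would keep which-memslot-is-which out of the tests entirely, which
addresses your last point.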