On Mon, May 16, 2022 at 4:42 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Mon, May 16, 2022 at 03:34:28PM -0700, David Matlack wrote:
> > On Mon, May 16, 2022 at 3:17 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Apr 29, 2022 at 06:39:35PM +0000, David Matlack wrote:
> > > > +static void perf_test_l1_guest_code(struct vmx_pages *vmx, uint64_t vcpu_id)
> > > > +{
> > > > +#define L2_GUEST_STACK_SIZE 64
> > > > +	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
> > > > +	unsigned long *rsp;
> > > > +
> > > > +	GUEST_ASSERT(vmx->vmcs_gpa);
> > > > +	GUEST_ASSERT(prepare_for_vmx_operation(vmx));
> > > > +	GUEST_ASSERT(load_vmcs(vmx));
> > > > +	GUEST_ASSERT(ept_1g_pages_supported());
> > > > +
> > > > +	rsp = &l2_guest_stack[L2_GUEST_STACK_SIZE - 1];
> > > > +	*rsp = vcpu_id;
> > > > +	prepare_vmcs(vmx, perf_test_l2_guest_entry, rsp);
> > >
> > > Just to purely ask: is this setting the same stack pointer for all the
> > > vCPUs?
> >
> > No, but I understand the confusion since typically selftests use
> > symbols like "l2_guest_code" that are global. But "l2_guest_stack" is
> > actually a local variable, so it will be allocated on the stack. Each
> > vCPU runs on a separate stack, so each will run with its own
> > "l2_guest_stack".
>
> Ahh that's correct!
>
> >
> > >
> > > > +
> > > > +	GUEST_ASSERT(!vmlaunch());
> > > > +	GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
> > > > +	GUEST_DONE();
> > > > +}
> > >
> > > [...]
> > >
> > > > +/* Identity map the entire guest physical address space with 1GiB Pages. */
> > > > +void nested_map_all_1g(struct vmx_pages *vmx, struct kvm_vm *vm)
> > > > +{
> > > > +	__nested_map(vmx, vm, 0, 0, vm->max_gfn << vm->page_shift, PG_LEVEL_1G);
> > > > +}
> > >
> > > Could max_gfn be large? Could it consume a bunch of pages even if mapping
> > > 1G only?
> >
> > Since the selftests only support 4-level EPT, this will use at most
> > 513 pages. If we add support for 5-level EPT we may need to revisit
> > this approach.
>
> It's just that AFAICT vm_alloc_page_table() is fetching from slot 0 for all
> kinds of pgtables including EPT. I'm not sure whether there can be some
> failures conditionally with this, because when creating the vm we're not
> aware of this consumption, so maybe we'd reserve the pages somehow so that
> we'll be sure to have those pages at least?

So far in my tests perf_test_util seemed to allocate enough pages in slot 0
that this just worked, so I didn't bother to explicitly reserve the extra
pages. But that's an accident waiting to happen, as you point out, so I'll
fix it in v2.

>
> --
> Peter Xu
>
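
For reference, the "at most 513 pages" figure above follows from the
4-level EPT geometry: with 1GiB mappings only the PML4 and PDPT levels are
populated, so a 48-bit guest-physical space needs one PML4 page plus 512
PDPT pages. A minimal sketch of the arithmetic (plain standalone C, not
selftest code; the helper name is made up for illustration):

static uint64_t ept_pages_for_1g_ident_map(uint64_t max_gfn,
					   uint64_t page_shift)
{
	/* Bytes of guest-physical space to identity map. */
	uint64_t size = max_gfn << page_shift;

	/* Round up to 1GiB units; each PDPT entry maps 1GiB. */
	uint64_t gib = (size + (1ULL << 30) - 1) >> 30;

	/* One PDPT page covers 512 entries, i.e. 512GiB. */
	uint64_t pdpts = (gib + 511) / 512;

	/* One PML4 page plus the PDPT pages. */
	return 1 + pdpts;
}

Plugging in a 48-bit GPA space with 4KiB pages (max_gfn = 2^36 - 1,
page_shift = 12) gives 1 + 512 = 513 pages, matching the number quoted in
the thread. With 5-level EPT the same identity map could add up to 512
PML4 pages per PML5 entry, which is why that case would need the
allocation approach revisited.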