Re: [kvm-unit-tests PATCH 05/12] nSVM: Remove NPT reserved bits tests (new one on the way)

Sean Christopherson <seanjc@xxxxxxxxxx> · Thu, 24 Jun 2021 17:43:37 +0000

On Thu, Jun 24, 2021, Paolo Bonzini wrote:
> On 22/06/21 23:00, Sean Christopherson wrote:
> > Remove two of nSVM's NPT reserved bits tests; a soon-to-be-added test will
> > provide a superset of their functionality, e.g. the current tests are
> > limited in the sense that they test a single entry and a single bit,
> > e.g. don't test conditionally-reserved bits.
> > 
> > The npt_rsvd test in particular is quite nasty as it subtly relies on
> > EFER.NX=1; dropping the test will allow cleaning up the EFER.NX weirdness
> > (it's forced for _all_ tests, presumably to get the desired PFEC.FETCH=1
> > for this one test).
> > 
> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > ---
> >   x86/svm_tests.c | 45 ---------------------------------------------
> >   1 file changed, 45 deletions(-)
> 
> This exposes a KVM bug, reproducible with
> 
> 	./x86/run x86/svm.flat -smp 2 -cpu max,+svm -m 4g \
> 		-append 'npt_rw npt_rw_pfwalk'

Any chance you're running against an older KVM version?  The test passes if I
run against a build with my MMU pile on top of kvm/queue, but fails on a random
older KVM.

Side topic, these tests all fail to invalidate TLB entries after modifying PTEs.
I suspect they work in part because KVM flushes and syncs on all nested SVM
transitions...

> While running npt_rw_pfwalk, the #NPF gets an incorrect EXITINFO2
> (address for the NPF location; on my machine it gets 0xbfede6f0 instead of
> 0xbfede000).  The same tests work with QEMU from git.
> 
> I didn't quite finish analyzing it, but my current theory is
> that KVM receives a pagewalk NPF for a *different* page walk that is caused
> by read-only page tables; then it finds that the page walk to 0xbfede6f0
> *does fail* (after all the correct and wrong EXITINFO2 belong to the same pfn)
> and therefore injects it anyway.  This theory is because the 0x6f0 offset in
> the page table corresponds to the 0xde000 part of the faulting address.
> Maxim will look into it while I'm away.
> 
> Paolo
>
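
For what it's worth, the offset arithmetic in the theory above can be
checked with a small standalone sketch.  It assumes 4 KiB pages and 8-byte
entries with 512 entries per table, i.e. standard x86-64 long-mode paging;
the helper names are made up for illustration and are not taken from
kvm-unit-tests or KVM.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4 KiB pages, 512 entries per table, 8 bytes per entry. */
#define PAGE_SHIFT	12
#define PT_INDEX_MASK	0x1ffULL
#define PTE_SIZE	8ULL

/*
 * Hypothetical helper: byte offset, within the last-level page table,
 * of the entry that maps @addr (address bits 20:12 select the entry).
 */
static inline uint64_t pte_byte_offset(uint64_t addr)
{
	return ((addr >> PAGE_SHIFT) & PT_INDEX_MASK) * PTE_SIZE;
}

/* Hypothetical helper: page frame number containing @addr. */
static inline uint64_t page_frame(uint64_t addr)
{
	return addr >> PAGE_SHIFT;
}

/* Sanity-check the two EXITINFO2 values quoted in this thread. */
static inline void check_reported_addresses(void)
{
	/* Index 0xde (bits 20:12 of 0xbfede000) times 8 bytes is 0x6f0. */
	assert(pte_byte_offset(0xbfede000ULL) == 0x6f0);

	/* The expected and observed values land in the same page frame. */
	assert(page_frame(0xbfede6f0ULL) == page_frame(0xbfede000ULL));
}
```

Under those assumptions, 0xbfede6f0 is consistent with being the address of
the entry at index 0xde in a table located at 0xbfede000, which fits the
idea that the reported fault belongs to a walk through the page tables
themselves rather than to the final access.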