On Wed, Aug 04, 2021, Maxim Levitsky wrote:
> Hi!
>
> I recently triaged a series of failures that I am seeing on both of my AMD
> machines in the kvm selftests.
>
> One test failed due to a trivial typo, for which I have sent a fix, but most
> of the other tests failed due to what I now suspect to be a very minor, but
> still a CPU bug.
>
> All of the failing tests, except two tests that time out (I haven't yet
> triaged them), use the perf_test_util.c library.
> All of these fail with the SHUTDOWN exit reason.
>
> After a relatively recent commit ef4c9f4f6546 ("KVM: selftests: Fix 32-bit
> truncation of vm_get_max_gfn()"), vm_get_max_gfn() was fixed to return the
> maximum GFN that the guest can use.
> For the default VM type this value is obtained from 'vm->pa_bits', which is
> in turn obtained from the guest's CPUID in the kvm_get_cpu_address_width()
> function.
>
> It is 48 on both my AMD machines (3970X and 4650U) and also on a remote
> EPYC 7302P machine (all of them are Zen2 machines).
>
> My 3970X has SME enabled by BIOS, while my 4650U doesn't have it enabled.
> The 7302P also has SME enabled.
> SEV was obviously not enabled for the test.
> NPT was enabled.
>
> It appears that if the guest uses any GPA above 0xFFFCFFFFF000 in its guest
> paging tables, then it gets a #PF with the reserved bits error code.

LOL, I encountered this joy a few weeks back.  There's a magic HyperTransport
region at the top of memory that is reserved, even for GPAs.  You and I say
"CPU BUG!!!", AMD says "working as intended" ;-)

https://lkml.kernel.org/r/20210625020354.431829-2-seanjc@xxxxxxxxxx
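
To put numbers on the above, here is a rough sketch in C of the arithmetic
involved. It is not the selftests code: the helper names are made up for
illustration, 4 KiB pages are assumed, and the only inputs are the 48-bit
address width and the 0xFFFCFFFFF000 boundary reported in the quoted message.
The size of the unusable range is simply what the subtraction yields, not a
figure stated anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB guest pages */

/* Highest GPA reported as still usable in the quoted message. */
#define OBSERVED_LAST_GOOD_GPA UINT64_C(0xFFFCFFFFF000)

/* Hypothetical helper: highest GFN addressable with pa_bits address bits. */
static inline uint64_t max_gfn_from_pa_bits(unsigned int pa_bits)
{
	assert(pa_bits > PAGE_SHIFT && pa_bits < 64);
	return (UINT64_C(1) << (pa_bits - PAGE_SHIFT)) - 1;
}

/* Hypothetical helper: last GFN that did not fault, per the report. */
static inline uint64_t observed_last_good_gfn(void)
{
	return OBSERVED_LAST_GOOD_GPA >> PAGE_SHIFT;
}

/*
 * Hypothetical helper: number of bytes between the first faulting GPA
 * (one page past the observed boundary) and the top of the pa_bits-wide
 * address space.
 */
static inline uint64_t unusable_top_bytes(unsigned int pa_bits)
{
	uint64_t first_bad_gpa = OBSERVED_LAST_GOOD_GPA +
				 (UINT64_C(1) << PAGE_SHIFT);

	assert(pa_bits > PAGE_SHIFT && pa_bits < 64);
	return (UINT64_C(1) << pa_bits) - first_bad_gpa;
}
```

With pa_bits = 48, max_gfn_from_pa_bits() gives 0xFFFFFFFFF, whereas the last
GFN that worked is 0xFFFCFFFFF. The gap, unusable_top_bytes(48), comes out to
0x300000000 bytes, i.e. the top 12 GiB of the 48-bit space, which is where a
test that places memory at the max GFN would land.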