On 6/23/21 6:05 PM, Sean Christopherson wrote:
> A few fixes centered around enumerating guest MAXPHYADDR and handling the
> C-bit in KVM.
>
> DISCLAIMER: I have no idea if patch 04, "Truncate reported guest
> MAXPHYADDR to C-bit if SEV is" is architecturally correct. The APM says
> the following about the C-bit in the context of SEV, but I can't for the
> life of me find anything in the APM that clarifies whether "effectively
> reduced" is supposed to apply to _only_ SEV guests, or any guest on an
> SEV enabled platform.
>
>   Note that because guest physical addresses are always translated through
>   the nested page tables, the size of the guest physical address space is
>   not impacted by any physical address space reduction indicated in
>   CPUID 8000_001F[EBX]. If the C-bit is a physical address bit however,
>   the guest physical address space is effectively reduced by 1 bit.
>
> In practice, I have observed that Rome CPUs treat the C-bit as reserved for
> non-SEV guests (another disclaimer on this below). Long story short, commit
> ef4c9f4f6546 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
> exposed the issue by inadvertently causing selftests to start using GPAs
> with bit 47 set.
>
> That said, regardless of whether or not the behavior is intended, it needs
> to be addressed by KVM. I think the only difference is whether this is
> KVM's _only_ behavior, or whether it's gated by an erratum flag.
>
> The second disclaimer is that I haven't tested with memory encryption
> disabled in hardware. I wrote the patch assuming/hoping that only CPUs
> that report SEV=1 treat the C-bit as reserved, but I haven't actually
> tested the SEV=0 case on e.g. CPUs with only SME (we might have these
> platforms, but I've no idea how to access/find them), or CPUs with SME/SEV
> disabled in BIOS (again, I've no idea how to do this with our BIOS).

Here's an explanation of the physical address reduction for bare-metal and
guest.

With MSR 0xC001_0010[SMEE] = 0:
  No reduction in host or guest max physical address.

With MSR 0xC001_0010[SMEE] = 1:
- Reduction in the host is enumerated by CPUID 0x8000_001F_EBX[11:6],
  regardless of whether SME is enabled in the host or not. So, for example
  on EPYC generation 2 (Rome) you would see a reduction from 48 to 43.
- There is no reduction in physical address in a legacy guest (non-SEV
  guest), so the guest can use a 48-bit physical address.
- There is a reduction of only the encryption bit in an SEV guest, so
  the guest can use up to a 47-bit physical address. This is why the
  Qemu command line sev-guest option uses a value of 1 for the
  "reduced-phys-bits" parameter.

Thanks,
Tom

>
> Sean Christopherson (7):
>   KVM: x86: Use guest MAXPHYADDR from CPUID.0x8000_0008 iff TDP is
>     enabled
>   KVM: x86: Use kernel's x86_phys_bits to handle reduced MAXPHYADDR
>   KVM: x86: Truncate reported guest MAXPHYADDR to C-bit if SEV is
>     supported
>   KVM: x86/mmu: Do not apply HPA (memory encryption) mask to GPAs
>   KVM: VMX: Refactor 32-bit PSE PT creation to avoid using MMU macro
>   KVM: x86/mmu: Bury 32-bit PSE paging helpers in paging_tmpl.h
>   KVM: x86/mmu: Use separate namespaces for guest PTEs and shadow PTEs
>
>  arch/x86/kvm/cpuid.c            | 38 +++++++++++++++++---
>  arch/x86/kvm/mmu.h              | 11 ++----
>  arch/x86/kvm/mmu/mmu.c          | 63 ++++++++-------------------------
>  arch/x86/kvm/mmu/mmu_audit.c    |  6 ++--
>  arch/x86/kvm/mmu/mmu_internal.h | 14 ++++++++
>  arch/x86/kvm/mmu/paging_tmpl.h  | 52 ++++++++++++++++++++++++++-
>  arch/x86/kvm/mmu/spte.c         |  2 +-
>  arch/x86/kvm/mmu/spte.h         | 34 +++++++-----------
>  arch/x86/kvm/mmu/tdp_iter.c     |  6 ++--
>  arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
>  arch/x86/kvm/svm/svm.c          | 37 ++++++++++++++-----
>  arch/x86/kvm/vmx/vmx.c          |  2 +-
>  arch/x86/kvm/x86.c              |  3 ++
>  arch/x86/kvm/x86.h              |  1 +
>  14 files changed, 170 insertions(+), 101 deletions(-)
>
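
For reference, the reduction rules in Tom's explanation (SMEE = 0 vs. SMEE = 1, host vs. legacy guest vs. SEV guest) can be written out as a small C sketch. This is only an illustration of the arithmetic, not KVM code: the function names are hypothetical, the caller is assumed to have already read MSR 0xC001_0010[SMEE] and CPUID 0x8000_001F EBX, and the Rome EBX value is assembled from the numbers quoted in this thread (reduction of 5, C-bit at position 47) rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* CPUID 0x8000_001F EBX[11:6]: physical address bit reduction. */
static unsigned int phys_addr_reduction(unsigned int ebx)
{
	return (ebx >> 6) & 0x3f;
}

/*
 * Hypothetical helper: usable host physical address bits.
 * With SMEE = 1 the reduction applies whether or not SME is
 * actually enabled in the host.
 */
static unsigned int host_phys_bits(unsigned int max_bits, bool smee,
				   unsigned int ebx)
{
	if (!smee)
		return max_bits;
	return max_bits - phys_addr_reduction(ebx);
}

/*
 * Hypothetical helper: usable guest physical address bits.
 * A legacy (non-SEV) guest sees no reduction; an SEV guest loses
 * only the encryption bit.
 */
static unsigned int guest_phys_bits(unsigned int max_bits, bool smee,
				    bool sev_guest)
{
	if (!smee || !sev_guest)
		return max_bits;
	return max_bits - 1;
}

/* Self-check using the Rome numbers quoted in the thread. */
static void rome_example_check(void)
{
	/* Illustrative EBX: reduction = 5 in [11:6], C-bit position 47 in [5:0]. */
	unsigned int ebx = (5u << 6) | 47u;

	assert(host_phys_bits(48, true, ebx) == 43);
	assert(guest_phys_bits(48, true, false) == 48);	/* legacy guest */
	assert(guest_phys_bits(48, true, true) == 47);	/* SEV guest */
}
```

The one-bit reduction for the SEV guest in guest_phys_bits() is the same value that the sev-guest "reduced-phys-bits" parameter carries in Tom's example.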