On 4/15/2024 10:58 PM, Sean Christopherson wrote:
On Mon, Apr 15, 2024, Xiaoyao Li wrote:
On 4/12/2024 11:48 PM, Sean Christopherson wrote:
On Fri, Apr 12, 2024, Xiaoyao Li wrote:
If we go deep enough, it becomes a functional problem. It's not even _that_
ridiculous/contrived :-)
L1 KVM is still aware that the real MAXPHYADDR=52, and so there are no immediate
issues with reserved bits at that level.
But L1 userspace will unintentionally configure L2 with CPUID.0x8000_0008.EAX[7:0]=48,
and so L2 KVM will incorrectly think bits 51:48 are reserved. If both L0 and L1
are using TDP, neither L0 nor L1 will intercept #PF. And because L1 userspace
was told MAXPHYADDR=48, it won't know that KVM needs to be configured with
allow_smaller_maxphyaddr=true in order for the setup to function correctly.
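To make the reserved-bit mismatch concrete, here is a minimal sketch of how a guest would derive its reserved physical-address bits from the advertised MAXPHYADDR. This is not KVM's actual code: the helper names maxphyaddr_from_eax() and rsvd_pa_bits() and the PA_BITS_MAX constant are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed upper bound on physical address width for this sketch. */
#define PA_BITS_MAX 52

/* MAXPHYADDR is reported in CPUID.0x8000_0008.EAX[7:0]. */
static inline unsigned int maxphyaddr_from_eax(uint32_t eax)
{
	return eax & 0xff;
}

/*
 * Mask of physical-address bits [51:maxphyaddr], i.e. the bits a guest
 * that was told "maxphyaddr" would treat as reserved.
 */
static inline uint64_t rsvd_pa_bits(unsigned int maxphyaddr)
{
	assert(maxphyaddr > 0 && maxphyaddr <= PA_BITS_MAX);

	if (maxphyaddr == PA_BITS_MAX)
		return 0;

	return ((1ULL << PA_BITS_MAX) - 1) & ~((1ULL << maxphyaddr) - 1);
}
```

With MAXPHYADDR=48 the mask covers bits 51:48, which is what L2 KVM would believe, while with the real MAXPHYADDR=52 the mask is empty.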
In this case, a) L1 userspace was told by L1 KVM that MAXPHYADDR = 48 via
KVM_GET_SUPPORTED_CPUID. But b) L1 userspace gets MAXPHYADDR = 52 by
executing CPUID itself.
KVM can't assume userspace will do raw CPUID.
So the KVM ABI is that KVM_GET_SUPPORTED_CPUID always reports the
host's MAXPHYADDR, and if userspace wants to configure a smaller
MAXPHYADDR for the guest and expects it to function correctly, it needs
to set kvm_intel.allow_smaller_maxphyaddr?