On Mon, Mar 04, 2024 at 09:54:40AM +0800, Xiaoyao Li wrote:
> On 3/1/2024 6:17 PM, Gerd Hoffmann wrote:
> > query kvm for supported guest physical address bits using
> > KVM_CAP_VM_GPA_BITS. Expose the value to the guest via cpuid
> > (leaf 0x80000008, eax, bits 16-23).
> >
> > Signed-off-by: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> > ---
> >  target/i386/cpu.h     | 1 +
> >  target/i386/cpu.c     | 1 +
> >  target/i386/kvm/kvm.c | 8 ++++++++
> >  3 files changed, 10 insertions(+)
> >
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 952174bb6f52..d427218827f6 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -2026,6 +2026,7 @@ struct ArchCPU {
> >      /* Number of physical address bits supported */
> >      uint32_t phys_bits;
> > +    uint32_t guest_phys_bits;
> >      /* in order to simplify APIC support, we leave this pointer to the
> >         user */
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 2666ef380891..1a6cfc75951e 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -6570,6 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >          if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) {
> >              /* 64 bit processor */
> >              *eax |= (cpu_x86_virtual_addr_width(env) << 8);
> > +            *eax |= (cpu->guest_phys_bits << 16);
>
> I think you misunderstand this field.
>
> If you expose this field to guest, it's the information for nested guest.
> i.e., the guest itself runs as a hypervisor will know its nested guest can
> have guest_phys_bits for physical addr.

I think those limits (l1 + l2 guest phys-bits) are identical, no?

The problem this tries to solve is that making the guest phys-bits
smaller than the host phys-bits is problematic (which is why we have
allow_smaller_maxphyaddr), but nevertheless there are cases where
the usable guest physical address space is smaller than the host
physical address space. One case is Intel processors with phys-bits
larger than 48 and 4-level EPT. Another case is AMD processors with
phys-bits larger than 48 and the l0 hypervisor using 4-level paging.

The guest needs to know that limit, specifically the guest firmware,
so it knows where it can map PCI BARs.

take care,
  Gerd
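
For illustration, here is a rough sketch of how a guest (for example the
firmware) could decode the fields of cpuid leaf 0x80000008 eax discussed
above: phys-bits in bits 0-7, the virtual address width in bits 8-15 and
the guest phys-bits in bits 16-23. The struct and function names are
hypothetical and not taken from qemu or any firmware. The fallback to
phys-bits when the guest field reads as zero is an assumed policy, not
something the patch specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of CPUID leaf 0x80000008, register EAX. */
struct addr_bits {
    unsigned phys_bits;        /* EAX bits 0-7 */
    unsigned virt_bits;        /* EAX bits 8-15 */
    unsigned guest_phys_bits;  /* EAX bits 16-23, 0 if not reported */
};

/* Build EAX the way the quoted hunk does, by OR-ing shifted fields. */
static uint32_t pack_eax(unsigned phys, unsigned virt, unsigned guest_phys)
{
    /* Each field is 8 bits wide; larger values would spill over. */
    assert(phys <= 0xff && virt <= 0xff && guest_phys <= 0xff);
    return (uint32_t)phys | ((uint32_t)virt << 8) |
           ((uint32_t)guest_phys << 16);
}

/* Split EAX back into its three 8-bit fields. */
static struct addr_bits unpack_eax(uint32_t eax)
{
    struct addr_bits b;

    b.phys_bits       = eax & 0xff;
    b.virt_bits       = (eax >> 8) & 0xff;
    b.guest_phys_bits = (eax >> 16) & 0xff;
    return b;
}

/*
 * Hypothetical policy (an assumption): use the guest phys-bits when the
 * hypervisor reports them, otherwise fall back to phys-bits.
 */
static unsigned usable_phys_bits(struct addr_bits b)
{
    return b.guest_phys_bits ? b.guest_phys_bits : b.phys_bits;
}
```

With phys-bits 52 and guest phys-bits 48 (the 4-level EPT case),
usable_phys_bits() yields 48, so the firmware would keep PCI BARs below
1 << 48. With the guest field left at zero it yields 52.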