On Tue, Apr 23, 2019 at 11:23:59AM +0800, Like Xu wrote: > On 2019/4/23 2:35, Sean Christopherson wrote: > >> #define F(x) bit(X86_FEATURE_##x) > >> int kvm_update_cpuid(struct kvm_vcpu *vcpu) > >>@@ -426,6 +436,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, > >> switch (function) { > >> case 0: > >> entry->eax = min(entry->eax, (u32)(f_intel_pt ? 0x14 : 0xd)); > >>+ entry->eax = kvm_supported_intel_mcp() ? 0x1f : entry->eax; > > > >This all seems unnecessary. And by 'all', I mean the existing Intel PT > >and XSAVE leaf checks, as well as the new mcp check. entry->eax comes > >directly from hardware, and unless I missed something, PT and XSAVE are > >only exposed to the guest when they're supported in hardware. In other > >words, KVM will never need to adjust entry->eax to expose PT or XSAVE. > > We call this function for both case KVM_GET_SUPPORTED_CPUID and > KVM_GET_EMULATED_CPUID although kvm user could reconfig them via > KVM_SET_CPUID* path. Not that it matters, but __do_cpuid_ent() is only used for the non-emulated case, KVM_GET_EMULATED_CPUID goes to __do_cpuid_ent_emulated(). > >The original min() check was added by commit 0771671749b5 ("KVM: Enhance > >guest cpuid management"), which doesn't provide any explicit information > >on why KVM does min() in the first place. > > Exposing cpuid.0.eax in a blind way (with host hardware support) > is not a good practice for guest migration and improves compatibility > requirements. Right, but isn't the f_intel_pt check for example completely irrelevant? f_intel_pt is true if and only if hardware supports PT, i.e. CPUID.0.EAX and thus entry->eax will already be >=0x14. I don't fully understand whether or not KVM needs to raise the minimum to 0xb regardless of h/w XSAVE support, but it's likely irrelevant in the end. Anyways, back to 0x1f, kvm_supported_intel_mcp() returns true if and only if hardware's CPUID.0.EAX >= 0x1f, i.e. adjusting entry->eax is always a nop. So if KVM wants to advertise leaf 0x1f only when it's supported in hardware then adjusting entry->eax is unnecessary, and if KVM wants to unconditionally advertise 0x1f then adjusting entry->eax should also be done unconditionally. > >Given that the original code > >was "entry->eax = min(entry->eax, (u32)0xb);", my *guess* is that the > >idea was to always report "Extended Topology Enumeration Leaf" as > >supported so that userspace can enumerate the VM's topology to the guest > >even when hardware itself doesn't do so. > > If the host cpu mode is too antiquated to support 0xb, it wouldn't report > 0xb for sure. The host cpuid.0.eax has been over 0xb for a long time and > reached 0x1f in the latest SDM. > > AFAICT, the original code keeps minimum cpuid.0.eax out of features guest > just used or at least it claimed to use. > > > > >Assuming we want to allow userspace to use "V2 Extended Topology > >Enumeration Leaf" regardless of hardware support, then this can simply be: > > > > entry->eax = min(entry->eax, (u32)0x1f); > > > >Or am I completely missing something?