Re: [PATCH v2] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 27, 2022 at 2:21 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> Passing the host topology to the guest is almost certainly wrong
> and will confuse the scheduler.  In addition, several fields of
> these CPUID leaves vary on each processor; it is simply impossible to
> return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
> they can be passed to KVM_SET_CPUID2.
>
> The values that will most likely prevent confusion are all zeroes.
> Userspace will have to override it anyway if it wishes to present a
> specific topology to the guest.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
>  Documentation/virt/kvm/api.rst | 14 ++++++++++++++
>  arch/x86/kvm/cpuid.c           | 32 ++++++++++++++++----------------
>  2 files changed, 30 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index eee9f857a986..20f4f6b302ff 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8249,6 +8249,20 @@ CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``
>  It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
>  has enabled in-kernel emulation of the local APIC.
>
> +CPU topology
> +~~~~~~~~~~~~
> +
> +Several CPUID values include topology information for the host CPU:
> +0x0b and 0x1f for Intel systems, 0x8000001e for AMD systems.  Different
> +versions of KVM return different values for this information and userspace
> +should not rely on it.  Currently they return all zeroes.
> +
> +If userspace wishes to set up a guest topology, it should be careful that
> +the values of these three leaves differ for each CPU.  In particular,
> +the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
> +for 0x8000001e; the latter also encodes the core id and node id in bits
> +7:0 of EBX and ECX respectively.
> +
>  Obsolete ioctls and capabilities
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0810e93cbedc..164bfb7e7a16 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -759,16 +759,22 @@ struct kvm_cpuid_array {
>         int nent;
>  };
>
> +static struct kvm_cpuid_entry2 *get_next_cpuid(struct kvm_cpuid_array *array)
> +{
> +       if (array->nent >= array->maxnent)
> +               return NULL;
> +
> +       return &array->entries[array->nent++];
> +}
> +
>  static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
>                                               u32 function, u32 index)
>  {
> -       struct kvm_cpuid_entry2 *entry;
> +       struct kvm_cpuid_entry2 *entry = get_next_cpuid(array);
>
> -       if (array->nent >= array->maxnent)
> +       if (!entry)
>                 return NULL;
>
> -       entry = &array->entries[array->nent++];
> -
>         memset(entry, 0, sizeof(*entry));
>         entry->function = function;
>         entry->index = index;
> @@ -945,22 +951,13 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>                 entry->edx = edx.full;
>                 break;
>         }
> -       /*
> -        * Per Intel's SDM, the 0x1f is a superset of 0xb,
> -        * thus they can be handled by common code.
> -        */
>         case 0x1f:
>         case 0xb:
>                 /*
> -                * Populate entries until the level type (ECX[15:8]) of the
> -                * previous entry is zero.  Note, CPUID EAX.{0x1f,0xb}.0 is
> -                * the starting entry, filled by the primary do_host_cpuid().
> +                * No topology; a valid topology is indicated by the presence
> +                * of subleaf 1.
>                  */
> -               for (i = 1; entry->ecx & 0xff00; ++i) {
> -                       entry = do_host_cpuid(array, function, i);
> -                       if (!entry)
> -                               goto out;
> -               }
> +               entry->eax = entry->ebx = entry->ecx = 0;
>                 break;
>         case 0xd: {
>                 u64 permitted_xcr0 = kvm_caps.supported_xcr0 & xstate_get_guest_group_perm();
> @@ -1193,6 +1190,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>                 entry->ebx = entry->ecx = entry->edx = 0;
>                 break;
>         case 0x8000001e:
> +               /* Do not return host topology information.  */
> +               entry->eax = entry->ebx = entry->ecx = 0;
> +               entry->edx = 0; /* reserved */
>                 break;
>         case 0x8000001F:
>                 if (!kvm_cpu_cap_has(X86_FEATURE_SEV)) {
> --
> 2.31.1
>

This is a userspace ABI change that breaks existing hypervisors.
Please don't do this. Userspace ABIs are supposed to be inviolate.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux