Re: [PATCH v3 01/12] cpu_map: update script to handle versioned CPUs

On 3/4/24 11:35 AM, Jim Fehlig wrote:
On 3/1/24 10:13, Daniel P. Berrangé wrote:
On Fri, Mar 01, 2024 at 10:36:12AM -0600, Jonathon Jongsma wrote:
On 3/1/24 10:13 AM, Daniel P. Berrangé wrote:
On Tue, Feb 20, 2024 at 05:08:02PM -0700, Jim Fehlig wrote:
On 12/15/23 15:11, Jonathon Jongsma wrote:
Previously, the script only generated the parent CPU and any versions
that had a defined alias. The script now generates all CPU versions. Any
version that had a defined alias will continue to use that alias, but
those without aliases will use the generated name $BASECPUNAME-vN.
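For readers following along, the naming rule described above can be sketched roughly as follows. This is only an illustration of the rule as stated in the commit message, not the actual script: the function name, the shape of the `aliases` mapping, and the "ExampleCPU" model are all hypothetical, and the handling of the parent CPU is left out.

```python
def versioned_names(base, num_versions, aliases):
    """Return one model name per version, 1..num_versions.

    `aliases` maps a version number to its defined alias
    (a hypothetical input shape chosen for this sketch).
    """
    names = []
    for version in range(1, num_versions + 1):
        # A version with a defined alias keeps that alias; any other
        # version falls back to the generated name BASE-vN.
        names.append(aliases.get(version, f"{base}-v{version}"))
    return names


# Hypothetical model with three versions, where only v2 has an alias.
example = versioned_names("ExampleCPU", 3, {2: "ExampleCPU-IBRS"})
```

Here `example` holds "ExampleCPU-v1", "ExampleCPU-IBRS" and "ExampleCPU-v3", in that order.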

The reason for this change is two-fold. First, we need to add new models
that support new features (such as SEV-SNP). To deal with this, the
script now generates model definitions for all versions.

But we also need to ensure that our CPU definitions are migration-safe.
To deal with this issue we need to make sure we're always using the
canonical versioned names for CPUs.

Related to migration safety, do we need to be concerned with the expansion
of 'host-model' CPU? E.g. is it possible 'host-model' expands to EPYC before
introducing the new models, and EPYC-v4 afterwards? If so, what are the
ramifications of that?

Yes, I see that happening on my laptop in domcapabilities:

Currently libvirt reports:

      <mode name='host-model' supported='yes'>
        <model fallback='forbid'>Snowridge</model>
        <vendor>Intel</vendor>
        <maxphysaddr mode='passthrough' limit='46'/>
        <feature policy='require' name='ss'/>
        <feature policy='require' name='vmx'/>
       ...snip...


and after this series it reports:

      <mode name='host-model' supported='yes'>
        <model fallback='forbid'>Snowridge-v4</model>
        <vendor>Intel</vendor>
        <maxphysaddr mode='passthrough' limit='46'/>
        <feature policy='require' name='ss'/>
        <feature policy='require' name='vmx'/>
       ...snip...


That's not wrong per se, because Snowridge-v4 has a smaller
delta against my host CPU.

The problem is that libvirt updates the *live* XML for the
guest with this expansion.  IIUC, if we now attempt to
live migrate to a compatible machine running older libvirt
the migrate will fail as old libvirt doesn't know the -v4
CPU.
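To make the failure mode concrete, here is a minimal sketch of the kind of check involved, assuming we already have the set of model names the destination knows about. This is not how libvirt performs its migration checks; `model_unknown_to_target` and the `old_models` set are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET


def model_unknown_to_target(cpu_xml, target_models):
    """Return the model name if the target does not know it, else None."""
    # The <model> element is a direct child of the root element, both in
    # the domcapabilities <mode> snippet and in a domain <cpu> element.
    model = ET.fromstring(cpu_xml).findtext("model")
    return model if model not in target_models else None


# Expansion as reported after this series (trimmed from the output above).
NEW_EXPANSION = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Snowridge-v4</model>
  <vendor>Intel</vendor>
</mode>
"""

# Hypothetical: models known to the older libvirt on the destination.
old_models = {"Snowridge"}

missing = model_unknown_to_target(NEW_EXPANSION, old_models)
```

With these inputs `missing` is "Snowridge-v4", which is the name the older side would not recognise.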

Downstream, we (SUSE) don't really support migrating from new -> old. Is this something we aim to support upstream?

I don't know the answer to this question.



I'm not sure how to address this?

But don't we have this issue any time we add a new CPU model to libvirt?
Anytime there's a new model, it has the potential to be a closer match to
the host CPU than an existing model definition was. As I mentioned in my
previous reply, when e.g. the -noTSX CPU variants were added, didn't the
same sort of thing (potentially) happen? Or am I doing something
meaningfully different in this patch set than what happens in those
scenarios?

I think it probably /did/ happen, but that doesn't make it acceptable.
The noTSX stuff was the cause of massive amounts of compatibility pain
for mgmt apps, so the incompatibility in libvirt might have been glossed
over. We're adding a lot of new versions here, so we are possibly
increasing the visibility/impact of this libvirt change.

It can happen when we introduce an entirely new CPU model too. E.g. on a Genoa machine, prior to commit bfe53e9145c, host model expanded to

  <cpu mode='custom' match='exact' check='full'>
     <model fallback='forbid'>EPYC-Milan</model>
     <vendor>AMD</vendor>
     <feature policy='require' name='x2apic'/>
     <feature policy='require' name='tsc-deadline'/>
     <feature policy='require' name='hypervisor'/>
     <feature policy='require' name='tsc_adjust'/>
     <feature policy='require' name='avx512f'/>
     <feature policy='require' name='avx512dq'/>
     <feature policy='require' name='avx512ifma'/>
     <feature policy='require' name='avx512cd'/>
     <feature policy='require' name='avx512bw'/>
     <feature policy='require' name='avx512vl'/>
     <feature policy='require' name='avx512vbmi'/>
     <feature policy='require' name='avx512vbmi2'/>
     <feature policy='require' name='gfni'/>
     <feature policy='require' name='vaes'/>
     <feature policy='require' name='vpclmulqdq'/>
     <feature policy='require' name='avx512vnni'/>
     <feature policy='require' name='avx512bitalg'/>
     <feature policy='require' name='avx512-vpopcntdq'/>
     <feature policy='require' name='la57'/>
     <feature policy='require' name='spec-ctrl'/>
     <feature policy='require' name='stibp'/>
     <feature policy='require' name='arch-capabilities'/>
     <feature policy='require' name='ssbd'/>
     <feature policy='require' name='avx512-bf16'/>
     <feature policy='require' name='cmp_legacy'/>
     <feature policy='require' name='virt-ssbd'/>
     <feature policy='require' name='rdctl-no'/>
     <feature policy='require' name='skip-l1dfl-vmentry'/>
     <feature policy='require' name='mds-no'/>
     <feature policy='require' name='pschange-mc-no'/>
     <feature policy='disable' name='svm'/>
     <feature policy='require' name='topoext'/>
     <feature policy='disable' name='npt'/>
     <feature policy='disable' name='nrip-save'/>
     <feature policy='disable' name='svme-addr-chk'/>
   </cpu>

After commit bfe53e9145c

  <cpu mode='custom' match='exact' check='full'>
     <model fallback='forbid'>EPYC-Genoa</model>
     <vendor>AMD</vendor>
     <feature policy='require' name='x2apic'/>
     <feature policy='require' name='tsc-deadline'/>
     <feature policy='require' name='hypervisor'/>
     <feature policy='require' name='tsc_adjust'/>
     <feature policy='require' name='spec-ctrl'/>
     <feature policy='require' name='stibp'/>
     <feature policy='require' name='arch-capabilities'/>
     <feature policy='require' name='ssbd'/>
     <feature policy='require' name='cmp_legacy'/>
     <feature policy='require' name='virt-ssbd'/>
     <feature policy='require' name='rdctl-no'/>
     <feature policy='require' name='skip-l1dfl-vmentry'/>
     <feature policy='require' name='mds-no'/>
     <feature policy='require' name='pschange-mc-no'/>
     <feature policy='disable' name='svm'/>
     <feature policy='require' name='topoext'/>
     <feature policy='disable' name='npt'/>
     <feature policy='disable' name='nrip-save'/>
     <feature policy='disable' name='svme-addr-chk'/>
   </cpu>
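For anyone who wants to compare two such expansions mechanically, a small sketch along these lines may help. It assumes the two `<cpu>` elements are available as strings; the helper names are hypothetical and the XML below is trimmed to a few of the features shown above.

```python
import xml.etree.ElementTree as ET


def parse_cpu(cpu_xml):
    """Return (model name, {feature name: policy}) for a <cpu> element."""
    root = ET.fromstring(cpu_xml)
    model = root.findtext("model")
    features = {f.get("name"): f.get("policy") for f in root.findall("feature")}
    return model, features


def diff_expansions(before_xml, after_xml):
    """Summarise how the model and the explicit feature list changed."""
    before_model, before = parse_cpu(before_xml)
    after_model, after = parse_cpu(after_xml)
    return {
        "model": (before_model, after_model),
        # Features listed explicitly before but no longer listed.
        "dropped": sorted(set(before) - set(after)),
        # Features listed explicitly only in the new expansion.
        "added": sorted(set(after) - set(before)),
    }


BEFORE = """
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>EPYC-Milan</model>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='avx512f'/>
  <feature policy='require' name='gfni'/>
</cpu>
"""

AFTER = """
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>EPYC-Genoa</model>
  <feature policy='require' name='x2apic'/>
</cpu>
"""

result = diff_expansions(BEFORE, AFTER)
```

On this trimmed input the model changes from EPYC-Milan to EPYC-Genoa, and avx512f and gfni drop out of the explicit feature list, presumably because the newer base model already covers them.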

Regards,
Jim

Does anybody have a response to this point from Jim? I can't really think of a way forward if it's not acceptable for the host model expansion to change between different versions of libvirt.

Jonathon