Re: [PATCH v3 01/12] cpu_map: update script to handle versioned CPUs

On Mon, Mar 04, 2024 at 10:35:25AM -0700, Jim Fehlig wrote:
> On 3/1/24 10:13, Daniel P. Berrangé wrote:
> > On Fri, Mar 01, 2024 at 10:36:12AM -0600, Jonathon Jongsma wrote:
> > > On 3/1/24 10:13 AM, Daniel P. Berrangé wrote:
> > > > On Tue, Feb 20, 2024 at 05:08:02PM -0700, Jim Fehlig wrote:
> > > > > On 12/15/23 15:11, Jonathon Jongsma wrote:
> > > > > > Previously, the script only generated the parent CPU and any versions
> > > > > > that had a defined alias. The script now generates all CPU versions. Any
> > > > > > version that had a defined alias will continue to use that alias, but
> > > > > > those without aliases will use the generated name $BASECPUNAME-vN.
> > > > > > 
> > > > > > The reason for this change is two-fold. First, we need to add new models
> > > > > > that support new features (such as SEV-SNP). To deal with this, the
> > > > > > script now generates model definitions for all versions.
> > > > > > 
> > > > > > But we also need to ensure that our CPU definitions are migration-safe.
> > > > > > To deal with this issue we need to make sure we're always using the
> > > > > > canonical versioned names for CPUs.
> > > > > 
> > > > > Related to migration safety, do we need to be concerned with the expansion
> > > > > of 'host-model' CPU? E.g. is it possible 'host-model' expands to EPYC before
> > > > > introducing the new models, and EPYC-v4 afterwards? If so, what are the
> > > > > ramifications of that?
> > > > 
> > > > Yes, I see that happening on my laptop in domcapabilities:
> > > > 
> > > > Currently libvirt reports:
> > > > 
> > > >       <mode name='host-model' supported='yes'>
> > > >         <model fallback='forbid'>Snowridge</model>
> > > >         <vendor>Intel</vendor>
> > > >         <maxphysaddr mode='passthrough' limit='46'/>
> > > >         <feature policy='require' name='ss'/>
> > > >         <feature policy='require' name='vmx'/>
> > > >        ...snip...
> > > > 
> > > > 
> > > > and after this series it reports:
> > > > 
> > > >       <mode name='host-model' supported='yes'>
> > > >         <model fallback='forbid'>Snowridge-v4</model>
> > > >         <vendor>Intel</vendor>
> > > >         <maxphysaddr mode='passthrough' limit='46'/>
> > > >         <feature policy='require' name='ss'/>
> > > >         <feature policy='require' name='vmx'/>
> > > >        ...snip...
> > > > 
> > > > 
> > > > That's not wrong per se, because Snowridge-v4 has a smaller
> > > > delta against my host CPU.
> > > > 
> > > > The problem is that libvirt updates the *live* XML for the
> > > > guest with this expansion.  IIUC, if we now attempt to
> > > > live migrate to a compatible machine running older libvirt
> > > > the migrate will fail as old libvirt doesn't know the -v4
> > > > CPU.
> 
> Downstream, we (SUSE) don't really support migrating from new -> old. Is
> this something we aim to support upstream?

Kind of, sort of, yes and no :)

The VIR_DOMAIN_XML_MIGRATABLE flag is a bit of an attempt to make
it possible to format XML in a way that's (hopefully) mostly acceptable
to older libvirt.

The devil is in the detail though, and there's never really been
any formal testing to prove correctness, so new -> old is one of
those things that may work; please report bugs if we missed
something.
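
To make the failure mode concrete, here is a minimal sketch of the kind
of pre-flight check a mgmt app could do before attempting new -> old
migration. It assumes the live XML carries the expanded model name in
<cpu><model>, as described above. The function names and the
destination model set are hypothetical, purely for illustration; this
is not something libvirt does itself.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Hypothetical, trimmed live domain XML after host-model expansion.
LIVE_XML = """<domain type='kvm'>
  <name>demo</name>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>Snowridge-v4</model>
    <vendor>Intel</vendor>
  </cpu>
</domain>"""


def cpu_model_from_domain_xml(domain_xml: str) -> Optional[str]:
    """Return the text of <cpu><model>, or None if it is absent."""
    root = ET.fromstring(domain_xml)
    model = root.find("./cpu/model")
    if model is None or model.text is None:
        return None
    return model.text.strip()


def model_known_to_destination(domain_xml: str,
                               destination_models: Set[str]) -> bool:
    """True if the guest's CPU model name is one the destination knows.

    destination_models is an assumption: the caller has to obtain the
    set of model names the older libvirt understands by some other means.
    """
    model = cpu_model_from_domain_xml(domain_xml)
    # No explicit model means there is no model name to check.
    return model is None or model in destination_models


if __name__ == "__main__":
    old_libvirt_models = {"Snowridge"}  # hypothetical destination
    print(model_known_to_destination(LIVE_XML, old_libvirt_models))
```

A check like this only covers the model name; it says nothing about
feature flags or anything else in the XML that older libvirt may reject.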

> > > > I'm not sure how to address this?
> > > 
> > > But don't we have this issue any time we add a new CPU model to libvirt?
> > > Anytime there's a new model, it has the potential to be a closer match to
> > > the host CPU than an existing model definition was. As I mentioned in my
> > > previous reply, when e.g. the -noTSX CPU variants were added, didn't the
> > > same sort of thing (potentially) happen? Or am I doing something
> > > meaningfully different in this patch set than what happens in those
> > > scenarios?
> > 
> > I think it probably /did/ happen, but that doesn't make it acceptable.
> > The noTSX stuff was the cause of massive amounts of compatibility pain
> > for mgmt apps, so the incompatibility in libvirt might have been glossed
> > over. We're adding a lot of new versions here, so possibly increasing
> > the visibility/impact of this libvirt change.
> 
> It can happen when we introduce an entirely new CPU model too. E.g. on a
> Genoa machine, prior to commit bfe53e9145c, host model expanded to

Yeah, true, so that's a general problem with 'host-model' whenever
we introduce a new CPU generation after a user has already deployed
on that CPU generation.
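
As a rough illustration of how one might spot such a change, the sketch
below compares the 'host-model' model name between two domcapabilities
documents (e.g. saved from "virsh domcapabilities" before and after a
libvirt upgrade). The XML is a trimmed, well-formed version of the
Snowridge output quoted earlier in the thread; the helper names are
hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Trimmed domcapabilities documents based on the output quoted earlier.
BEFORE = """<domainCapabilities>
  <cpu>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Snowridge</model>
      <vendor>Intel</vendor>
    </mode>
  </cpu>
</domainCapabilities>"""

AFTER = BEFORE.replace(">Snowridge<", ">Snowridge-v4<")


def host_model_name(domcaps_xml: str) -> Optional[str]:
    """Return the model name reported under <mode name='host-model'>."""
    root = ET.fromstring(domcaps_xml)
    model = root.find("./cpu/mode[@name='host-model']/model")
    if model is None or model.text is None:
        return None
    return model.text.strip()


def expansion_changed(before_xml: str, after_xml: str) -> bool:
    """True if the host-model expansion names a different CPU model."""
    return host_model_name(before_xml) != host_model_name(after_xml)


if __name__ == "__main__":
    print(host_model_name(BEFORE), "->", host_model_name(AFTER))
    print("changed:", expansion_changed(BEFORE, AFTER))
```

This compares only the model name, not the accompanying feature list,
so two expansions that describe the same guest ABI under different
names would still show up as changed.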

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|