On Fri, Sep 04, 2009 at 04:58:25PM +0200, Jiri Denemark wrote:
> Firstly, CPU topology and all (actually all that libvirt knows about) CPU
> features have to be advertised in host capabilities:
>
>   <host>
>     <cpu>
>       ...
>       <features>
>         <feature>NAME</feature>
>       </features>
>       <topology>
>         <sockets>NUMBER_OF_SOCKETS</sockets>
>         <cores>CORES_PER_SOCKET</cores>
>         <threads>THREADS_PER_CORE</threads>
>       </topology>
>     </cpu>
>     ...
>   </host>

FWIW, we already have the host topology sockets/cores/threads exposed in the
virNodeInfo API / struct, though I don't see any harm in having it in the
node capabilities XML too, particularly since we put NUMA topology in there.

> I'm not 100% sure we should represent CPU features as <feature>NAME</feature>,
> especially because some features are currently advertised as <NAME/>. However,
> extending the XML schema every time a new feature is introduced doesn't look
> like a good idea at all. The problem is we can't get rid of <NAME/>-style
> features, which would result in redundancy:
>
>   <features>
>     <vmx/>
>     <feature>vmx</feature>
>   </features>
>
> But I think it's better than changing the schema to add new features.

I think we need more than just the features in the capabilities XML though.
For example, if an application wants to configure a guest with a CPU model of
'core2duo', it needs to know whether the host CPU is at least a 'core2duo' or
a superset. In essence, I think the host capabilities XML needs to be more
closely aligned with your proposed guest XML, specifically including a base
CPU model name, along with any additional features beyond the basic set
provided by that model.

Which brings me neatly to the next question. The host capabilities XML for
some random machine says the host CPU is a 'core2duo' + 'ssse3' + '3dnow'.
There is a guest to be run with an XML config requesting 'pentium3' + 'ssse3'
as a minimum requirement. Now pretend you are not a human who knows that
pentium3 is a subset of core2duo.
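To make that concrete, the host capabilities fragment for the hypothetical
machine above might combine a base model name with the extra features, along
these lines (a sketch only - the element names follow the schema being
discussed in this thread, not a finalized one):

  <host>
    <cpu>
      <model>core2duo</model>
      <features>
        <feature>ssse3</feature>
        <feature>3dnow</feature>
      </features>
      <topology>
        <sockets>1</sockets>
        <cores>2</cores>
        <threads>1</threads>
      </topology>
    </cpu>
  </host>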
How do we know whether it is possible to run the guest on that host? We could
say that we'll make 'virDomainCreate' just throw an error when you try to
start a guest (or incoming migration, etc), but if we have a data center of
many hosts, apps won't want to just try to start a guest on each host.
They'll want some way to figure out equivalence between CPU + feature sets.

Perhaps this suggests we want a

  virConnectCompareCPU(conn, "<guest cpu xml fragment>")

which returns 0 if the CPU is not compatible (ie a subset), 1 if it is
identical, or 2 if it is a superset. If we further declare that host
capabilities for CPU model follow the same schema as guest XML for CPU model,
we can use this same API to test 2 separate hosts for equivalence, and thus
figure out the lowest common denominator between a set of hosts, and also
thus what guests are available for that set of hosts.

For x86, this would require the libvirt internal driver to have an xml ->
cpuid convertor, but we already need one of those if we have to implement
this stuff for the Xen and VMWare drivers, so I don't see this as too bad. We
also of course need a cpuid -> xml convertor to populate the host
capabilities XML.

For all this I'm thinking we should have some basic external data files which
map named CPUs to sets of CPUID features, and named flags to CPUID bits.
Populate this with the set of CPUs QEMU knows about for now, and then we can
extend this later simply by dropping in new data files.

Back to your question about duplication:

>   <features>
>     <vmx/>
>     <feature>vmx</feature>
>   </features>

Just ignore the fact that we have vmx, pae + svm features defined for now.
Focus on determining what XML schema we want to use consistently across host
+ guest for describing a CPU model + features. Once that's determined, we'll
just fill in the legacy vmx/pae/svm features based off the data for the new
format, and recommend in the docs not to use the old style.
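The return-value convention proposed for virConnectCompareCPU can be sketched
with plain feature bitmasks. This is a toy model - the real thing would
operate on parsed XML and the CPUID data files, and none of these names exist
in libvirt yet:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the comparison semantics proposed for
 * virConnectCompareCPU(): model CPU features as a CPUID-style bitmask.
 * 0 = incompatible (the host lacks something the guest requires),
 * 1 = identical, 2 = the host is a strict superset. */
enum { CPU_INCOMPATIBLE = 0, CPU_IDENTICAL = 1, CPU_SUPERSET = 2 };

static int
compare_cpu_features(uint64_t host_bits, uint64_t guest_bits)
{
    if ((host_bits & guest_bits) != guest_bits)
        return CPU_INCOMPATIBLE;  /* guest needs a bit the host lacks */
    if (host_bits == guest_bits)
        return CPU_IDENTICAL;
    return CPU_SUPERSET;          /* host has everything, plus extra */
}
```

The same subset test, run in both directions between two hosts' masks, gives
the host-equivalence check described above.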
> Secondly, drivers which support detailed CPU specification have to advertise
> it in guest capabilities. In case <features> are meant to be hypervisor
> features, then it could look like:
>
>   <guest>
>     ...
>     <features>
>       <cpu/>
>     </features>
>   </guest>
>
> But if they are meant to be CPU features, we need to come up with something
> else:
>
>   <guest>
>     ...
>     <cpu_selection/>
>   </guest>
>
> I'm not sure how to deal with named CPUs suggested by Dan. Either we need to
> come up with a global set of named CPUs and document what they mean, or let
> drivers specify their own named CPUs and advertise them through guest
> capabilities:
>
>   <guest>
>     ...
>     <cpu model="NAME">
>       <feature>NAME</feature>
>       ...
>     </cpu>
>   </guest>
>
> The former approach would make matching named CPUs with those defined by a
> hypervisor (such as qemu) quite hard. The latter could bring the need for
> hardcoding features provided by specific CPU models or, in case we decide not
> to provide a list of features for each CPU model, it can complicate
> transferring a domain from one hypervisor to another.

As mentioned above, I think we want to define a set of named CPU models that
can be used across all drivers. For non-x86 we can just follow the standard
CPU model names in QEMU. For x86, since there are so many possible models,
with new ones appearing all the time, I think we should define a set of CPU
models starting from those in QEMU, but provide a way to add new models via
data files defining the CPUID mapping.

Internally, libvirt will need bi-directional CPUID <-> model+features
convertors to allow good support in all our drivers. Model+features -> CPUID
is easy - that's just a lookup. CPUID -> model+features is harder. We'd need
to iterate over all known models, and do a CPUID -> model+features conversion
for each model. Then pick the one that resulted in the fewest named features,
which will probably be the newest CPU model. This will ensure the XML will
always be the most concise.
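The "pick the model leaving the fewest named features" step could be sketched
like this. The model table and bitmask values are invented for illustration,
and I'm assuming a GCC-style popcount builtin for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the CPUID -> model+features direction: among known models
 * whose baseline feature bits are all present in the host's mask, pick
 * the one that leaves the fewest leftover bits to spell out as named
 * <feature> elements, keeping the resulting XML concise. */
struct cpu_model {
    const char *name;
    uint64_t bits;   /* baseline CPUID feature bits for this model */
};

static const struct cpu_model demo_models[] = {
    { "pentium3", 0x03 },  /* hypothetical baseline bits */
    { "core2duo", 0x0F },  /* superset of pentium3's bits */
};

static const char *
best_model(const struct cpu_model *models, size_t n, uint64_t host_bits)
{
    const char *best = NULL;
    int fewest_extra = 65;  /* more than any 64-bit popcount */

    for (size_t i = 0; i < n; i++) {
        if ((host_bits & models[i].bits) != models[i].bits)
            continue;  /* host lacks part of this model's baseline */
        int extra = __builtin_popcountll(host_bits & ~models[i].bits);
        if (extra < fewest_extra) {
            fewest_extra = extra;
            best = models[i].name;
        }
    }
    return best;  /* NULL if no known model fits at all */
}
```

In the real convertor the table would be loaded from the external data files
rather than compiled in.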
> And finally, CPU may be configured in domain XML configuration:
>
>   <domain>
>     ...
>     <cpu model="NAME">
>       <topology>
>         <sockets>NUMBER_OF_SOCKETS</sockets>
>         <cores>CORES_PER_SOCKET</cores>
>         <threads>THREADS_PER_CORE</threads>
>       </topology>

This bit about topology looks just fine.

>       <feature name="NAME" mode="set|check" value="on|off"/>
>     </cpu>
>   </domain>
>
> Mode 'check' checks the physical CPU for the feature and refuses to start
> the domain if it doesn't match. The VCPU feature is set to the same value.
> Mode 'set' just sets the VCPU feature.

The <feature> bit is probably a little too verbose for my liking. I'd prefer:

  <feature name='ssse3' policy='XXX'/>

with 'policy' allowing one of:

 - 'force'    - set to '1', even if the host doesn't have it
 - 'require'  - set to '1', fail if the host doesn't have it
 - 'optional' - set to '1', only if the host has it
 - 'disable'  - set to '0', even if the host has it
 - 'forbid'   - set to '0', fail if the host has it

'force' is unlikely to be used, but it's there for completeness, since Xen
and VMWare allow it. 'forbid' is for cases where you disable the CPUID bit,
but a guest may still try to access the feature anyway and you don't want it
to succeed - if you used 'disable', the guest could still try to use the
feature if the host supported it, even though it is masked out in CPUID.

The final complication is the 'optional' flag. If we set a feature to
'optional' and boot the guest on a host that has this feature, then when
trying to migrate, it in essence becomes a 'require' feature flag, since you
can't take a feature away from a running guest.

> Final note: <topology> could also be called <cpu_topology> to avoid
> confusion with NUMA <topology>, which is used in host capabilities.
> However, I prefer <cpu><topology>...</topology></cpu> over
> <cpu><cpu_topology>...</cpu_topology></cpu>.

<cpu_topology> is redundant naming - the context of being within a <cpu> tag
is more than sufficient to distinguish it from the host capabilities NUMA
topology when using <topology>.
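Putting the pieces together, a guest <cpu> element using the policy attribute
might look like the following - purely illustrative, combining the elements
proposed above rather than any finalized schema:

  <domain>
    ...
    <cpu model="pentium3">
      <topology>
        <sockets>1</sockets>
        <cores>2</cores>
        <threads>1</threads>
      </topology>
      <feature name="ssse3" policy="require"/>
      <feature name="3dnow" policy="disable"/>
    </cpu>
  </domain>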
Finally, throughout this discussion I'm assuming that for non-x86 archs we'll
merely use the named CPU model and not bother about any features or flags
beyond this - just strict equivalence... until someone who cares enough about
those archs complains.

Daniel

-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

--
Libvir-list mailing list
Libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list