On 19.06.2015 14:27, Daniel Hansel wrote:
>
>
> On 18.06.2015 15:41, Daniel P. Berrange wrote:
>> On Wed, Jun 17, 2015 at 05:37:42PM +0200, Jiri Denemark wrote:
>>> Hi all (and sorry for the long email),
>>>
>>> The current way QEMU driver handles guest CPU configuration is not
>>> ideal. We detect host CPU capabilities only by querying the CPU and we
>>> don't check with QEMU what features it supports. We don't check QEMU's
>>> definitions of CPU models, which may be different from libvirt's
>>> definitions. All this results in several issues:
>>>
>>> - guest CPU may change during migration, save/restore
>>> - libvirt may ask for a CPU which QEMU cannot provide; the guest will
>>>   see a slightly different CPU but libvirt client won't know about it
>>> - libvirt may come up with a CPU that doesn't make sense and which won't
>>>   work for a guest (the guest may even crash)
>>>
>>> Although usually everything just works, it is very fragile.
>>
>> A third issue is that if there is no <cpu> in the guest config, we
>> just delegate CPU choice to QEMU and then ignore any CPU checks when
>> migrating. If libvirt owns the full CPU config, we'd probably want
>> to also decide the default ourselves, so that we will always be able
>> to do migration CPU checks.
>>
>>> Since we want to fix all these issues, we need to:
>>> - guarantee stable guest ABI (a single domain XML should always result
>>>   in the same guest ABI). Once a domain is started, its CPU definition
>>>   should never change (unless someone changes the XML, of course,
>>>   similar to, e.g. PCI addresses). However, there are a few exceptions:
>>>     - host-passthrough CPU mode will always result in "-cpu host"
>>>     - host-model CPU mode should recompute the CPU model on every start,
>>>       but the CPU must not change during migration
>>> - always make sure QEMU provides the CPU we asked for. Starting a domain
>>>   should fail in case QEMU cannot provide exactly the CPU we asked for.
>>> - provide usable host-model mode and custom mode with minimum match. We
>>>   need to generate CPU configurations that actually work, i.e., we need
>>>   to ask QEMU what CPU it can provide on current host rather than
>>>   requesting a bunch of features on top of a CPU model which does not
>>>   always match the host CPU.
>>>
>>> QEMU already provides or will soon provide everything we need to meet
>>> these requirements:
>>> - we can cover every configurable part of a CPU in our cpu_map.xml and
>>>   instead of asking QEMU for a specific CPU model we can use "-cpu
>>>   custom" with a fully specified CPU
>>> - we can use the additional data about CPU models to choose the right
>>>   one for a host CPU
>>> - when starting a domain we can check whether QEMU filtered out any of
>>>   the features we asked for and refuse to start the domain
>>> - we can ask QEMU what would "-cpu host" look like and use that for
>>>   host-model and minimum match CPUs (it won't work for TCG mode, though,
>>>   but we can keep using the current CPUID detection code for TCG)
>>
>> In TCG mode of course, 'host-model' and 'host-passthrough' are
>> effectively identical, and don't actually need the host to support
>> all the features, since TCG is fully emulated. Which means that you
>> can migrate TCG guests to any host with any model :-) I wonder if
>> we are probably accidentally restricting that today, because we
>> assume KVM needs host support.
>>
>>> Once we start maintaining CPU models with all the details, we will
>>> likely meet the same issues QEMU folks meet, i.e., we will need to fix
>>> bugs in existing CPU models. And it's not just about adding or removing
>>> CPU features but also fixing other parameters, such as wrong level, etc.
>>> It's clear every change will require a new CPU model to be defined. But
>>> I think we should do it in a way that applications or users should not
>>> need (if they don't want to) to care about it. I'm thinking about doing
>>> something similar to machine types.
>>> Each CPU model could be defined in
>>> several versions and a CPU spec without a version would be an alias to
>>> the latest version.
>>
>> Agreed, I think that versioning CPU models, independently of machine
>> types makes sense. It is probably a little more complex - in most cases
>> we'd increase the version, but in some cases I think we'd end up wanting
>> to define new named models. For example, with the recent TSX scenario we
>> had, using versions would not have been appropriate, because Intel in
>> fact ship 2 variants of the silicon. So even with versioning, we
>> would still have wanted to introduce the noTSX variants of the models.
>>
>>> The problem is, we need to maintain backward compatibility and we should
>>> avoid breaking existing domains (shouldn't we?) which just work even
>>> though their guest CPUs do not exactly match the domain XML definitions.
>>
>> Yep breaking existing domains isn't too pleasant!
>>
>>> So either we need to define all existing CPU models in all their
>>> variants used for various machine types and have a mapping between
>>> (model without a version, machine type) to a specific version of the
>>> model (which may be quite hard) or we need to be able to distinguish
>>> between an existing domain and a new domain with no CPU model version.
>>> While host-model and host-passthrough CPU modes are easy because they
>>> are designed to change every time a domain starts (which means we don't
>>> need to be able to distinguish between existing and new domains), custom
>>> CPU mode is tricky. Currently, the only at least a bit reasonable thing
>>> which came to my mind is to have a new CPU mode, but it still seems
>>> awkward so please share your ideas if you have any.
>>
>> Introducing a new CPU mode feels pretty unpleasant to me.
>>
>> Although it will certainly be tedious work, getting details of all the
>> CPU variants for historical machine types should be doable I think.
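To make the versioning idea quoted above a bit more concrete, here is a minimal Python sketch of how an unversioned model name could resolve either to the latest version or to a version pinned by machine type. This is only an illustration under assumptions, not libvirt or QEMU code: the `resolve_cpu_model` helper, the "Name-vN" spelling, the "ExampleCPU" model, and the machine type name are all hypothetical.

```python
# Hypothetical registry: model name -> list of defined versions.
CPU_MODEL_VERSIONS = {
    "ExampleCPU": [1, 2, 3],
}

# Hypothetical pins for historical machine types:
# (model without a version, machine type) -> specific version.
MACHINE_TYPE_PINS = {
    ("ExampleCPU", "example-machine-1.0"): 1,
}


def resolve_cpu_model(spec, machine_type=None):
    """Return (name, version) for a CPU spec such as "ExampleCPU" or
    "ExampleCPU-v2". The "-vN" suffix is an assumed naming convention."""
    name, sep, version = spec.rpartition("-v")
    if sep and version.isdigit() and name in CPU_MODEL_VERSIONS:
        # Explicit version: it must be one we actually define.
        if int(version) not in CPU_MODEL_VERSIONS[name]:
            raise ValueError("unknown version of %s: %s" % (name, version))
        return name, int(version)

    if spec not in CPU_MODEL_VERSIONS:
        raise ValueError("unknown CPU model: %s" % spec)

    # Unversioned spec: prefer a pin for the machine type, so domains
    # tied to an old machine type keep the CPU they always had.
    pinned = MACHINE_TYPE_PINS.get((spec, machine_type))
    if pinned is not None:
        return spec, pinned

    # Otherwise the bare name is an alias for the latest version.
    return spec, max(CPU_MODEL_VERSIONS[spec])
```

With this layout, a variant such as the noTSX models mentioned above would simply be a separate key in the registry rather than a new version of an existing one.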
>>
>>> BTW, I don't think we should try to expose every part of the CPU model
>>> definitions in domain XML, they should remain hidden behind the CPU
>>> model name. It would be hard to explain what each of the extra
>>> parameters means, each model would have to include them anyway since we
>>> can't expect users to provide all the details correctly, and once
>>> visible in domain XML it could encourage users to play with the values.
>>
>> Yeah, I don't think we need to expose all the raw details. If people really
>> badly want to be able to customize that, then we should instead look at
>> how we could better enable the cpu_map.xml file to be admin extensible.
>

Hi Daniel and Jirka, just as a ping if you have missed my comment...

> Hi,
>
> currently Michael Mueller (IBM) is working on an extension of QEMU to support CPU models for the s390x platform.
> During the discussion on the QEMU mailing list the implementation was made more generic so that it can support all platforms.
>
> Based on that new implementation I have implemented a first version for libvirt to retrieve the CPU model(s) supported by QEMU on s390x.
> Because the discussion is still ongoing, my prototype is not ready to be tested yet.
>
> A short overview of the current prototype I have implemented (QEMU CPU model support patches from Michael Mueller required):
>
> 1. During start of the libvirt daemon the QEMU monitor is used to retrieve the CPU models QEMU supports (i.e. just model names, QEMU handles all other settings like features, etc.).
> 2. The supported CPU models are stored in libvirt's QEMU capabilities (and stored in the capabilities cache file).
> 3. Each call of virConnectGetCPUModelNames() (i.e. qemuConnectGetCPUModelNames()) retrieves the information from QEMU capabilities (cached or not) on the s390x platform.
>    All other platforms keep the currently implemented way of parsing cpu_map.xml.
>
> With that implementation all requests to get CPU models (e.g.
> for CPU model comparison, CPU model listing) will lead to a more appropriate result (e.g. if a QEMU binary is replaced by a
> manually built QEMU binary).
>
>>
>> Regards,
>> Daniel
>>
>

--
Kind regards
Daniel Hansel
IBM Deutschland Research & Development GmbH
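
For reference, the three prototype steps quoted above can be sketched roughly as follows in Python. This is only an illustration of the control flow, not the actual libvirt patch: the `QemuCaps` class, the `query_models` callable (standing in for the QEMU monitor query), and the `cpu_map` dictionary (standing in for the parsed cpu_map.xml) are all hypothetical.

```python
class QemuCaps:
    """Hypothetical stand-in for libvirt's per-binary QEMU capabilities."""

    def __init__(self, binary, query_models):
        self.binary = binary
        # query_models is an assumed callable that asks the given QEMU
        # binary (via the monitor) for the model names it supports.
        self._query_models = query_models
        self._cpu_models = None  # filled on first use, then cached

    def cpu_models(self):
        # Steps 1 and 2: probe QEMU once and keep the result cached.
        if self._cpu_models is None:
            self._cpu_models = list(self._query_models(self.binary))
        return self._cpu_models


def get_cpu_model_names(arch, caps, cpu_map):
    """Step 3: s390x answers from QEMU capabilities, every other
    architecture keeps using the names parsed from cpu_map.xml."""
    if arch == "s390x":
        return sorted(caps.cpu_models())
    return sorted(cpu_map.get(arch, []))
```

Because the cache is keyed to a `QemuCaps` object per binary, replacing the QEMU binary would mean creating a new object and probing again, which matches the point about manually built binaries.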