On Mon, 5 Feb 2018 17:10:18 +0100
Viktor Mihajlovski <mihajlov@xxxxxxxxxxxxxxxxxx> wrote:

> On 05.02.2018 16:37, Luiz Capitulino wrote:
> > On Mon, 5 Feb 2018 13:47:27 +0000
> > Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
> >
> >> On Mon, Feb 05, 2018 at 02:43:15PM +0100, Viktor Mihajlovski wrote:
> >>> On 02.02.2018 21:41, Eduardo Habkost wrote:
> >>>> On Fri, Feb 02, 2018 at 03:19:45PM -0500, Luiz Capitulino wrote:
> >>>>> On Fri, 2 Feb 2018 18:09:12 -0200
> >>>>> Eduardo Habkost <ehabkost@xxxxxxxxxx> wrote:
> >>>> [...]
> >>>>>> Your plan above covers what will happen when using newer QEMU
> >>>>>> versions, but libvirt still needs to work sanely if running QEMU
> >>>>>> 2.11. My suggestion is that libvirt do not run query-cpus to ask
> >>>>>> for the "halted" field on any architecture except s390.
> >>>>>
> >>>>> My current plan is to ask libvirt to completely remove query-cpus
> >>>>> usage, independent of the arch and use the new command instead.
> >>>>
> >>>> This would be a regression for people running QEMU 2.11 on s390.
> >>>>
> >>>> (But maybe it would be an acceptable regression? Viktor, what do
> >>>> you think? Are there production releases of management systems
> >>>> that already rely on vcpu.<n>.halted?)
> >>>>
> >>> Unfortunately, there's code out there looking at vcpu.<n>.halted. I've
> >>> informed the product team about the issue.
> >>>
> >>> If we drop/deprecate vcpu.<n>.halted from the domain statistics, this
> >>> should be done for all arches, if there's a replacement mechanism (i.e.
> >>> new VCPU states). As a stop-gap measure we can make the call
> >>> arch-dependent until the new stuff is in place.
> >>
> >> Yes, I think libvirt should just restrict this 'halted' feature reporting
> >> to s390 only, since the other archs have different semantics for this
> >> item, and the s390 semantics are the ones we want.
> >
> > From this whole discussion, there's only one thing that I still don't
> > understand (in a very honest way): what makes s390 halted semantics
> > different?
> One problem is that using the halted property to indicate that the CPU
> has assumed the architected disabled wait state may not have been the
> wisest decision (my fault). If the CPU enters disabled wait, it will
> stay inactive until it is explicitly restarted which is different on x86.

Ah, OK. So, s390 does indeed have different semantics.

> > By quickly looking at the code, it seems to be very like the x86 one
> > when in kernel irqchip is not used: if a guest vCPU executes HLT, the
> > vCPU exits to userspace and qemu will put the vCPU thread to sleep.
> > This is the semantics I'd expect for HLT, and maybe for all archs.
> >
> > What makes x86 different, is when the in kernel irqchip is used (which
> > should be the default with libvirt). In this case, the vCPU thread avoids
> > exiting to user-space. So, qemu doesn't know the vCPU halted.
> >
> > That's only one of the reasons why query-cpus forces vCPUs to user-space.
> > But there are other reasons, and that's why even on s390 query-cpus
> > will also force vCPUs to user-space, which means s390 has the same perf
> > issue but maybe this hasn't been detected yet.
> >
> > For the immediate term, I still think we should have a query-cpus
> > replacement that doesn't cause vCPUs to go to userspace. I'll work this
> > this week.
> FWIW: I currently exploring an extension to query-cpus to report
> s390-specific information, allowing to ditch halted in the long run.
> Further, I'm considering a new QAPI event along the lines of "CPU info
> has changed" allowing QEMU to announce low-frequency changes of CPU
> state (as is the case for s390) and finally wire up a handler in libvirt
> to update a tbd. property (!= halted).

I very much prefer adding a replacement for query-cpus, which works for
all archs and which doesn't have any performance impact.

> >
> > However, IMHO, what we really want is to add an API to the guest agent
> > to export the CPU online bit from the guest userspace sysfs. This will
> > give the ultimate semantics and move us away from this halted mess.
> >
>
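
To illustrate the "CPU online bit from sysfs" idea quoted above, here is a
rough sketch of what reading it inside the guest could look like. It is not
guest agent code and not a proposed API: the function name read_online_bits
and its sysfs_root parameter are made up for this example. It assumes the
usual Linux layout of <root>/cpu<N>/online containing "0" or "1", and it
treats a CPU directory without an "online" file as online, which is an
assumption of this sketch (such CPUs typically cannot be offlined).

```python
import re
from pathlib import Path

# Matches per-CPU directories such as "cpu0" or "cpu12", but not
# sibling entries like "cpufreq" or "cpuidle".
CPU_DIR_RE = re.compile(r"^cpu(\d+)$")


def read_online_bits(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_index: is_online} read from <sysfs_root>/cpu<N>/online.

    Hypothetical helper for illustration only; not part of QEMU,
    the guest agent, or libvirt.
    """
    result = {}
    for entry in Path(sysfs_root).iterdir():
        match = CPU_DIR_RE.match(entry.name)
        if match is None or not entry.is_dir():
            continue
        index = int(match.group(1))
        online_file = entry / "online"
        if online_file.exists():
            result[index] = online_file.read_text().strip() == "1"
        else:
            # Assumption of this sketch: no "online" file means the CPU
            # cannot be taken offline, so report it as online.
            result[index] = True
    return result


if __name__ == "__main__":
    for cpu, online in sorted(read_online_bits().items()):
        print(f"cpu{cpu}: {'online' if online else 'offline'}")
```

Since this only reads files inside the guest, it does not need to kick
vCPUs out to userspace on the host the way query-cpus does; how the result
would be transported to the host is left open here.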