2017-03-13 18:08+0200, Michael S. Tsirkin:
> On Mon, Mar 13, 2017 at 04:46:20PM +0100, Radim Krčmář wrote:
>> 2017-03-10 00:29+0200, Michael S. Tsirkin:
>> > Some guests call mwait without checking the cpu flags. We currently
>> > emulate that as a NOP but on VMX we can do better: let guest stop the
>> > CPU until timer or IPI. CPU will be busy but that isn't any worse than
>> > a NOP emulation.
>> >
>> > Note that mwait within guests is not the same as on real hardware
>> > because you must halt if you want to go deep into sleep.
>>
>> SDM (25.3 CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION)
>> says that "MWAIT operates normally". What is the reason why MWAIT
>> inside VMX cannot reach the same states as MWAIT outside VMX?
>
> If you are going into a deep sleep state with huge latency you are
> better off exiting and paying an extra microsecond latency
> since a chance some other task will want to schedule seems higher.

Oh, so MWAIT behavior is same and can reach deep sleep, just use-cases
differ ... If the guest VCPU is running on isolated CPU, then you might
want to reach a deep state to save power when there is no better use.

>> > Thus it isn't
>> > a good idea to use the regular MWAIT flag in CPUID for that. Add a flag
>> > in the hypervisor leaf instead.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
>> > ---
>> [...]
>> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> > @@ -594,6 +594,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>> > +		if (this_cpu_has(X86_FEATURE_MWAIT))
>> > +			entry->eax = (1 << KVM_FEATURE_MWAIT);
>>
>> I'd rather not add it as a paravirt feature:
>>
>> - MWAIT requires the software to provide a target state, but we're not
>>   doing anything to expose those states.
>
> Current linux guests just discover these states based on
> CPU model, so we do expose enough info.

Linux still filters the hardcoded hints through CPUID[5].edx, which is 0
in our case.

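To illustrate the filtering mentioned above, here is a minimal sketch modeled on the check that Linux's intel_idle driver applies to its per-model hint tables. The function name mwait_hint_supported() and the sample register values are assumptions made for this example, not code from the patch or the kernel. The field widths follow the CPUID leaf 5 layout, where edx carries a 4-bit sub-state count per C-state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout of MWAIT hints and CPUID[5].edx (4 bits per field). */
#define MWAIT_SUBSTATE_MASK 0xfu
#define MWAIT_CSTATE_MASK   0xfu
#define MWAIT_SUBSTATE_SIZE 4

/*
 * Hypothetical helper: return true if CPUID[5].edx advertises at least
 * one sub-state for the C-state targeted by an MWAIT hint.
 * Hint bits 7:4 select the C-state (0 means C1); edx holds one 4-bit
 * sub-state count per C-state, starting with C0 in bits 3:0.
 */
static bool mwait_hint_supported(uint32_t cpuid5_edx, uint32_t hint)
{
	unsigned int cstate = ((hint >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) + 1;
	unsigned int shift = cstate * 4;

	/*
	 * edx has room for eight 4-bit fields (C0..C7); anything beyond
	 * that is treated as unsupported in this sketch.
	 */
	if (shift >= 32)
		return false;

	return ((cpuid5_edx >> shift) & MWAIT_SUBSTATE_MASK) != 0;
}
```

With cpuid5_edx == 0, as in the guest setup discussed here, the helper rejects every hint, so a guest that applies this kind of filter ends up with no usable MWAIT states regardless of its CPU model table.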
>>   The feature would need very constrained setup, which is hard to
>>   support
>
> Why would it? It works without any tweaking on several boxes
> I own.

MWAIT hints do not always mean the same, so they could lead to different
power/performance tradeoffs than the application expects. We should at
least specify that the paravirt feature allows only hint 0.

You probably don't run weird combinations of host/guest CPUs.

>> - we've had requests to support MWAIT emulation for Linux and fully
>>   emulating MWAIT would be best.
>>   MWAIT is not going to be enabled by default, of course; it would be
>>   targeted at LPAR-like uses of KVM.
>
> Yes I think this limited emulation is safe to enable by default.
> Pretending mwait is equivalent to halt maybe isn't.

Right, we must keep the VCPU thread running when emulating mwait as it is
different from a hlt.

>> What about keeping just the last hunk to improve OS X, for now?
>>
>> Thanks.
>
> IMHO if we have a new functionality we are better off creating
> some way for guests to discover it is there.
>
> Do we really have to argue about a single bit in HV leaf?
> What harm does it do?

It adds code to both guest and hosts and needs documentation ...
The bit is acceptable. I just see no point in having it when there
already is a detection mechanism for mwait.

In any case, this patch should also remove VM exits under SVM and add
KVM_CAP_MWAIT for userspace. Userspace can then set the MWAIT feature
if it wishes the guest to use it in a more standard way.

I can do a cleanup due to unused VM exits on top of it.

Thanks.
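
For reference, the two detection paths compared in this thread could look roughly like this from the guest side. This is only a sketch: struct cpuid_regs and both helpers are hypothetical names, the register values are expected to come from the guest's own CPUID calls, and the bit position of KVM_FEATURE_MWAIT is left as a parameter because the thread does not state it. The standard path reads the MONITOR/MWAIT flag in CPUID[1].ecx bit 3; the paravirt path checks the KVM signature in leaf 0x40000000 and then a bit in eax of the feature leaf 0x40000001.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypervisor CPUID leaves used by KVM. */
#define KVM_CPUID_SIGNATURE 0x40000000u
#define KVM_CPUID_FEATURES  0x40000001u

/* CPUID[1].ecx bit 3: MONITOR/MWAIT supported. */
#define CPUID_1_ECX_MONITOR (1u << 3)

/* Hypothetical container for the four registers one CPUID call returns. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Standard detection: regs must hold the result of CPUID leaf 1. */
static bool has_standard_mwait(const struct cpuid_regs *leaf1)
{
	return (leaf1->ecx & CPUID_1_ECX_MONITOR) != 0;
}

/*
 * Paravirt detection: sig holds leaf KVM_CPUID_SIGNATURE, feat holds
 * leaf KVM_CPUID_FEATURES.  The signature is "KVMKVMKVM\0\0\0" spread
 * over ebx, ecx and edx.  The bit argument stands in for the value of
 * KVM_FEATURE_MWAIT, which is an assumption left to the caller.
 */
static bool has_paravirt_bit(const struct cpuid_regs *sig,
			     const struct cpuid_regs *feat,
			     unsigned int bit)
{
	bool is_kvm = sig->ebx == 0x4b4d564bu &&
		      sig->ecx == 0x564b4d56u &&
		      sig->edx == 0x0000004du;

	if (!is_kvm || bit >= 32)
		return false;

	return (feat->eax & (1u << bit)) != 0;
}
```

The paravirt path needs a new constant and check on both sides, while the standard path reuses a flag guests already look at.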