Hi all, On Wed, 17 Apr 2019 at 03:18, Joao Martins <joao.m.martins@xxxxxxxxxx> wrote: > > On 4/16/19 4:40 PM, Liran Alon wrote: > >> On 16 Apr 2019, at 18:21, Liran Alon <liran.alon@xxxxxxxxxx> wrote: > >>> On 15 Apr 2019, at 21:17, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > >>> On Mon, Apr 15, 2019 at 06:45:26PM +0300, Liran Alon wrote: > >>> > >>> Technically, I think this is a Qemu bug. KVM reports all zeros for > >>> CPUID_MWAIT_LEAF when userspace queries KVM_GET_SUPPORTED_CPUID and > >>> KVM_GET_EMULATED_CPUID. And I think that's correct/desired, supporting > >>> MONITOR/MWAIT sub-features should be a separate enabling patch set. > >> > >> At some point in time Jim added commit df9cb9cc5bcd ("kvm: x86: Amend the KVM_GET_SUPPORTED_CPUID API documentation”) > >> which added the following paragraph to documentation: > >> "Note that certain capabilities, such as KVM_CAP_X86_DISABLE_EXITS, may > >> expose cpuid features (e.g. MONITOR) which are not supported by kvm in > >> its default configuration. If userspace enables such capabilities, it > >> is responsible for modifying the results of this ioctl appropriately.” > >> > >> It’s indeed not clear what it means to “modify the results of this ioctl *appropriately*” right? > >> It can either mean you just expose in CPUID[EAX=1].ECX support for MONITOR/MWAIT > >> or that you also expose CPUID_MWAIT_LEAF (CPUID[EAX=5]). > >> Both regardless of the value returned from KVM_GET_SUPPORTED_CPUID ioctl. > >> > >> Having said that, I tend to agree with you. > >> Instead of emulating this MSR in KVM, I think it it preferred to change QEMU to expose MONITOR/MWAIT support in CPUID[EAX=1].ECX > >> but in CPUID[EAX=5] init everything as in host besides ECX[0] which will be set to 0 to report we don’t support extensions. > >> (We still want to support range of monitor line size, whether we can treat interrupts as break-events for MWAIT and the supported C substates). > >> I will create this patch for QEMU. > > > > Actually on second thought, I will just remain with the KVM patch (that Paolo was nice enough to already queue). > > and not do this QEMU patch. This is because why will we want to prevent guest from specifying target C-State if he is exposed with MWAIT? > > I don’t see a reason we should prevent that. Do you? > > > One reason it is a good idea to prevent the guest from entering deeper > C-states (e.g. deeper than C2) is to allow preemption timer to be used again > when guests are exposed with MWAIT (currently we can't do that). It is weird that we can observe intel_idle driver in the guest executes mwait eax=0x20, and the corresponding pCPU enters C3 on HSW server, however, we can't observe this on SKX/CLX server, it just enters maximal C1. All the setups between HSW and SKX/CLX are the same. Sean and Liran, any opinions? Regards, Wanpeng Li