Re: [PATCH] KVM: VMX: Nop emulation of MSR_IA32_POWER_CTL

On Sat, 11 May 2019 at 01:17, Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Fri, May 10, 2019 at 11:34:41AM +0100, Joao Martins wrote:
> > On 5/10/19 10:54 AM, Wanpeng Li wrote:
> > > It is weird that we can observe intel_idle driver in the guest
> > > executes mwait eax=0x20, and the corresponding pCPU enters C3 on HSW
> > > server, however, we can't observe this on SKX/CLX server, it just
> > > enters maximal C1.
> >
> > I assume you refer to the case where you pass the host mwait substates to the
> > guests as is, right? Or are you zeroing/filtering out the mwait cpuid leaf EDX
> > like my patch (attached in the previous message) suggests?
> >
> > Interestingly, hints set to 0x20 actually corresponds to C6 on HSW (based on
> > intel_idle driver). IIUC From the SDM (see Vol 2B, "MWAIT for Power Management"
> > in instruction set reference M-U) the hints register, doesn't necessarily
> > guarantee the specified C-state depicted in the hints will be used. The manual
> > makes it sound like it is tentative, and implementation-specific condition may
> > either ignore it or enter a different one. It appears to be only guaranteed that
> > it won't enter a C-{sub,}state deeper than the one depicted.
>
> Yep, section "MWAIT EXTENSIONS FOR ADVANCED POWER MANAGEMENT" is more
> explicit on this point:
>
>   At CPL=0, system software can specify desired C-state and sub C-state by
>   using the MWAIT hints register (EAX).  Processors will not go to C-state
>   and sub C-state deeper than what is specified by the hint register.
>
> As for why SKX/CLX only enters C1, AFAICT SKX isn't configured to support
> C3, e.g. skx_cstates in drivers/idle/intel_idle.c shows C1, C1E and C6.
> A quick search brings up a variety of docs that confirm this.  My guess is
> that C1E provides better power/performance than C3 for the majority of
> server workloads, e.g. C3 doesn't provide enough power savings to justify
> its higher latency and TLB flush.

You are right. I figured this out by referring to the SKX/CLX EDS: the
core C-states of these two generations are limited to CC0/CC1/CC1E/CC6.
The issue here is that, after exposing mwait to the guest, an SKX/CLX
guest can't enter CC6, whereas an HSW guest can enter CC3/CC6. Both HSW
and SKX/CLX hosts can enter CC6. We observe SKX/CLX guests executing
mwait with eax=0x20, but we never see the corresponding pCPU enter CC6,
either in turbostat or by reading MSR_CORE_C6_RESIDENCY directly.
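
For anyone who wants to repeat the check, here is a minimal userspace
sketch of the two steps described above: splitting the mwait hint into
its C-state and sub C-state fields, and sampling MSR_CORE_C6_RESIDENCY
on a given pCPU. It assumes the msr module is loaded, that it runs as
root on the host, and that the MSR index is 0x3fd as defined in the
kernel's msr-index.h. The helper names (mwait_hint_cstate,
mwait_hint_substate, read_msr, residency_delta) are made up for this
example and are not kernel or turbostat functions.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed to match MSR_CORE_C6_RESIDENCY in the kernel's msr-index.h. */
#define MSR_CORE_C6_RESIDENCY 0x3fd

/* MWAIT hint layout: EAX[7:4] is the C-state field, EAX[3:0] the sub C-state. */
static unsigned int mwait_hint_cstate(uint32_t eax)
{
	return (eax >> 4) & 0xf;
}

static unsigned int mwait_hint_substate(uint32_t eax)
{
	return eax & 0xf;
}

/*
 * Read one MSR through the msr driver (/dev/cpu/<cpu>/msr).
 * The file offset selects the MSR index. Returns 0 on success, -1 on error.
 */
static int read_msr(int cpu, uint32_t msr, uint64_t *val)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, val, sizeof(*val), msr);
	close(fd);

	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* Residency accumulated between two samples; unsigned math handles wrap. */
static uint64_t residency_delta(uint64_t before, uint64_t after)
{
	return after - before;
}
```

Sampling the counter with read_msr() before and after an idle period
on the pCPU that backs the vCPU, and passing both values to
residency_delta(), gives the CC6 residency over that interval. A delta
that stays at zero while the guest keeps issuing mwait with eax=0x20
matches the behaviour reported above.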

Regards,
Wanpeng Li


