Re: [PATCH v2] KVM: X86: Emulate APERF/MPERF to report actual vCPU frequency

Jim Mattson <jmattson@xxxxxxxxxx> · Thu, 6 Jan 2022 10:01:07 -0800

On Wed, Jan 5, 2022 at 7:29 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 6/1/2022 6:51 am, Jim Mattson wrote:
> > On Thu, Dec 30, 2021 at 11:48 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
> >>
> >> On 31/12/2021 9:29 am, Jim Mattson wrote:
> >
> >>> At sched-in:
> >>> 1. Save host APERF/MPERF values from the MSRs.
> >>> 2. Load the "current" guest APERF/MPERF values into the MSRs (if the
> >>> vCPU configuration allows for unintercepted reads).
> >>>
> >>> At sched-out:
> >>> 1. Calculate the guest APERF/MPERF deltas for use in step 3.
> >>> 2. Save the "current" guest APERF/MPERF values.
> >>> 3. "Restore" the host APERF/MPERF values, but add in the deltas from step 1.
> >>>
> >>> Without any writes to IA32_MPERF, I would expect these MSRs to be
> >>> synchronized across all logical processors, and the proposal above
> >>> would break that synchronization.
> >
> > I am learning more about IA32_APERF and IA32_MPERF this year. :-)
>
> Uh, thanks for your attention.
>
> >
> > My worry above is unfounded. These MSRs only increment in C0, so they
> > are not likely to be synchronized.
> >
> > This also raises another issue with your original fast-path
> > implementation: the host MSRs will continue to count while the guest
> > is halted. However, the guest MSRs should not count while the guest is
> > halted.
> >
>
> Emulation based on guest TSC semantics, with low precision, may solve this.
> TBH, I still haven't given up on the idea of a pass-through approach.

I believe that pass-through can work for IA32_APERF. It can also work
for IA32_MPERF on AMD hosts or when the TSC multiplier is 1 on Intel
hosts. However, I also believe that it requires KVM to load the
hardware MSRs with the guest's values prior to VM-entry, and to update
the hardware MSRs with newly calculated host values before any other
consumers on the host may read them.
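
The sched-in/sched-out steps quoted above can be sketched as a small
user-space model. This is only an illustration of the bookkeeping, not
KVM code: struct hw_counter stands in for one hardware MSR (IA32_APERF
or IA32_MPERF) on a logical processor, and every name in it (hw_counter,
vcpu_perf, sched_in, sched_out) is hypothetical. It assumes a TSC
multiplier of 1 and does not model the halted-guest case raised earlier
in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one hardware counter MSR (APERF or MPERF). */
struct hw_counter {
	uint64_t value;
};

/* Hypothetical per-vCPU state for that counter. */
struct vcpu_perf {
	uint64_t guest;      /* saved "current" guest value */
	uint64_t host_saved; /* host value captured at sched-in */
};

/* Sched-in: save the host value, then expose the guest value. */
static void sched_in(struct hw_counter *hw, struct vcpu_perf *v)
{
	assert(hw && v);
	v->host_saved = hw->value; /* step 1: save host value */
	hw->value = v->guest;      /* step 2: load guest value */
}

/* Sched-out: fold the cycles counted for the guest back into the host. */
static void sched_out(struct hw_counter *hw, struct vcpu_perf *v)
{
	assert(hw && v);
	/* step 1: delta accumulated while the guest value was loaded
	 * (unsigned arithmetic keeps this correct across wraparound) */
	uint64_t delta = hw->value - v->guest;

	v->guest = hw->value;              /* step 2: save guest value */
	hw->value = v->host_saved + delta; /* step 3: restore host + delta */
}
```

With this ordering the host counter ends up where it would have been
had the guest value never been loaded, provided nothing on the host
reads the counter between sched-in and sched-out.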