On 09/07/2018 11:27 AM, Andi Kleen wrote:
On Thu, Sep 06, 2018 at 07:30:54PM +0800, Wei Wang wrote:
This patch adds an interface that lets a guest request that KVM save
and restore the lbr stack on vCPU context switch.
KVM cannot tell whether the guest is actively using the lbr feature
from the lbr enable bit in the debugctl MSR, because the guest enables
and disables that control bit frequently, and in some cases the bit is
disabled even while the guest is actively using the lbr feature. For
example, perf_pmu_sched_task in the guest disables the bit before
reading out the lbr stack, so the bit is clear although the guest is
still using the lbr feature.
So the guest uses a KVM-specific MSR, MSR_KVM_PV_LBR_CTRL, at a
proper place to tell KVM whether the LBR is actively in use. Basically,
the lbr user callstack mode needs the lbr stack to be saved/restored on
context switch, so we set the ACTIVE bit of MSR_KVM_PV_LBR_CTRL only
when the user callstack mode is used. The KVM hypervisor saves/restores
the lbr stack on vCPU switching once the ACTIVE bit is set.
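To make the intended flow concrete, here is a minimal userspace model of
it, not the patch code. The bit position of ACTIVE, the stack depth, the
struct and all function names are assumptions made for illustration; only
the MSR name and the meaning of the ACTIVE bit come from the description
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values: bit position and depth are illustrative only. */
#define PV_LBR_CTRL_ACTIVE (1ULL << 0)
#define LBR_DEPTH 4

struct vcpu_model {
	uint64_t pv_lbr_ctrl;          /* models the guest-written MSR_KVM_PV_LBR_CTRL */
	uint64_t hw_lbr[LBR_DEPTH];    /* models the physical lbr stack */
	uint64_t saved_lbr[LBR_DEPTH]; /* per-vCPU save area */
	unsigned int switches;         /* save/restore operations performed */
};

/* Guest driver side: set ACTIVE only for the user callstack mode. */
static void guest_set_lbr_mode(struct vcpu_model *v, bool user_callstack)
{
	assert(v);
	if (user_callstack)
		v->pv_lbr_ctrl |= PV_LBR_CTRL_ACTIVE;
	else
		v->pv_lbr_ctrl &= ~PV_LBR_CTRL_ACTIVE;
}

/* Host side: save the lbr stack on sched-out only if ACTIVE is set. */
static void host_sched_out(struct vcpu_model *v)
{
	if (!(v->pv_lbr_ctrl & PV_LBR_CTRL_ACTIVE))
		return;
	memcpy(v->saved_lbr, v->hw_lbr, sizeof(v->saved_lbr));
	v->switches++;
}

/* Host side: restore the lbr stack on sched-in only if ACTIVE is set. */
static void host_sched_in(struct vcpu_model *v)
{
	if (!(v->pv_lbr_ctrl & PV_LBR_CTRL_ACTIVE))
		return;
	memcpy(v->hw_lbr, v->saved_lbr, sizeof(v->hw_lbr));
	v->switches++;
}
```

In this model the host never inspects debugctl; the only input to the
save/restore decision is the hint the guest driver writes.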
PV is difficult because it requires changing all the users.
It requires changes to the guest driver, but remains transparent to guest
user applications (e.g. the perf tool).
Btw, we tested it, and it works in the guest as well as on native Linux.
This was thought of as the hardest part of this work. Let me just
clarify it a little bit:
The fundamental behavior we want to achieve is:
#1 when the vCPU is actively using the LBR feature, save/restore the
lbr stack when the vCPU is scheduled out/in;
#2 when the vCPU is NOT actively using the LBR feature, DON'T
save/restore the lbr stack when the vCPU is scheduled out/in;
The key problem we need to solve is: how does the host know if the guest
is actively using the lbr feature or not?
Maybe a better approach would be a lazy restore of the LBRs:
Don't restore the LBRs on context switch, but set the LBR MSRs to intercept.
Then on the first access restore the LBRs and allow direct access to the
MSRs again.
Also when the LBRs haven't been set to direct access the state doesn't
need to be saved.
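The lazy scheme described above can be sketched as a small userspace
state machine. This is only a model under stated assumptions: the struct,
the function names and the stack depth are hypothetical, and a guest
access is represented by an explicit read call rather than a real MSR
intercept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LBR_DEPTH 4 /* illustrative depth, not a hardware value */

struct lazy_vcpu {
	bool passthrough;              /* false: lbr MSR accesses are intercepted */
	uint64_t hw_lbr[LBR_DEPTH];    /* models the physical lbr stack */
	uint64_t saved_lbr[LBR_DEPTH]; /* per-vCPU save area */
	unsigned int saves;
	unsigned int restores;
};

/* Sched-out: save only if the guest had direct access to the MSRs. */
static void lazy_sched_out(struct lazy_vcpu *v)
{
	if (!v->passthrough)
		return; /* the saved copy is still the current guest state */
	memcpy(v->saved_lbr, v->hw_lbr, sizeof(v->saved_lbr));
	v->saves++;
}

/* Sched-in: do not restore, just re-arm the intercept. */
static void lazy_sched_in(struct lazy_vcpu *v)
{
	v->passthrough = false;
}

/* Guest access: the first one after sched-in restores, then passes through. */
static uint64_t guest_read_lbr(struct lazy_vcpu *v, unsigned int idx)
{
	assert(v && idx < LBR_DEPTH);
	if (!v->passthrough) {
		memcpy(v->hw_lbr, v->saved_lbr, sizeof(v->hw_lbr));
		v->restores++;
		v->passthrough = true;
	}
	return v->hw_lbr[idx];
}
```

Note that the model only tracks explicit MSR accesses by the guest; it
says nothing about what the hardware records while the MSRs are passed
through.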
This could achieve #1 above, but how would it solve #2? That is, after
the guest has used the lbr feature for a while, the lbr stack has been
passed through. If the guest then stops using lbr, won't the vCPU still
save/restore on switching?
(The host cannot tell from debugctl[0] that the guest is no longer using
lbr; the commit log above explains why.)
Best,
Wei