On 7/17/2019 12:03 AM, Eduardo Habkost wrote:
On Fri, Jul 12, 2019 at 04:29:06PM +0800, Tao Xu wrote:
UMWAIT and TPAUSE instructions use IA32_UMWAIT_CONTROL at MSR index E1H
to determines the maximum time in TSC-quanta that the processor can reside
in either C0.1 or C0.2.
This patch emulates MSR IA32_UMWAIT_CONTROL in guest and differentiate
IA32_UMWAIT_CONTROL between host and guest. The variable
mwait_control_cached in arch/x86/power/umwait.c caches the MSR value, so
this patch uses it to avoid frequently rdmsr of IA32_UMWAIT_CONTROL.
Co-developed-by: Jingqi Liu <jingqi.liu@xxxxxxxxx>
Signed-off-by: Jingqi Liu <jingqi.liu@xxxxxxxxx>
Signed-off-by: Tao Xu <tao3.xu@xxxxxxxxx>
---
[...]
+static void atomic_switch_umwait_control_msr(struct vcpu_vmx *vmx)
+{
+ if (!vmx_has_waitpkg(vmx))
+ return;
+
+ if (vmx->msr_ia32_umwait_control != umwait_control_cached)
+ add_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL,
+ vmx->msr_ia32_umwait_control,
+ umwait_control_cached, false);
How exactly do we ensure NR_AUTOLOAD_MSRS (8) is still large enough?
I see 3 existing add_atomic_switch_msr() calls, but the one at
atomic_switch_perf_msrs() is in a loop. Are we absolutely sure
that perf_guest_get_msrs() will never return more than 5 MSRs?
Quote the code of intel_guest_get_msrs:
static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
{
[...]
arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
if (x86_pmu.flags & PMU_FL_PEBS_ALL)
arr[0].guest &= ~cpuc->pebs_enabled;
else
arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
*nr = 1;
if (x86_pmu.pebs && x86_pmu.pebs_no_isolation) {
[...]
arr[1].msr = MSR_IA32_PEBS_ENABLE;
arr[1].host = cpuc->pebs_enabled;
arr[1].guest = 0;
*nr = 2;
[...]
There are most 2 msrs now. By default umwait is disabled in KVM. So by
default there is no MSR_IA32_UMWAIT_CONTROL added into
add_atomic_switch_msr().
Thanks.
+ else
+ clear_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL);
+}
+
static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
{
vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
[...]