On 5/21/2020 2:37 PM, Xiaoyao Li wrote:
On 5/21/2020 1:28 PM, Tao Xu wrote:
On 5/21/2020 12:33 PM, Xiaoyao Li wrote:
On 5/21/2020 5:05 AM, Paolo Bonzini wrote:
On 20/05/20 18:07, Maxim Levitsky wrote:
This msr is only available when the host supports WAITPKG feature.
This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.
Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
---
arch/x86/kvm/x86.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
continue;
+ break;
+ case MSR_IA32_UMWAIT_CONTROL:
+ if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+ continue;
default:
break;
}
The patch is correct, and matches what is done for the other entries of
msrs_to_save_all. However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM:
x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.
So while at it you should also fix that. The right way to do that
is to
add a
if (vmx_waitpkg_supported())
kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
+ Tao
I remember there is certainly some reason why we don't expose WAITPKG
to guest by default.
Tao, please help clarify it.
Thanks,
-Xiaoyao
Because in VM, umwait and tpause can put a (psysical) CPU into a power
saving state. So from host view, this cpu will be 100% usage by VM.
Although umwait and tpause just cause short wait(maybe 100
microseconds), we still want to unconditionally expose WAITPKG in VM.
I guess you typed "unconditionally" by mistake that you meant to say
"conditionally" in fact?
I am sorry, I mean:
By default, we don't expose WAITPKG to guest. For QEMU, we can use
"-overcommit cpu-pm=on" to use WAITPKG.