Hi, Thomas,

On Fri, 2019-04-05 at 14:30 +0200, Thomas Gleixner wrote:
> On Wed, 3 Apr 2019, Fenghua Yu wrote:
> > @@ -1663,6 +1663,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct
> > msr_data *msr_info)
> >  	u32 index;
> > 
> >  	switch (msr_info->index) {
> > +	case MSR_TEST_CTL:
> > +		if (!msr_info->host_initiated &&
> > +		    !(vcpu->arch.core_capability & CORE_CAP_SPLIT_LOCK_DETECT))
> 
> Errm? If the MSR_TEST_CTL is exposed to the guest via CPUID then the
> rdmsr() in the guest is not supposed to fail just because
> CORE_CAP_SPLIT_LOCK_DETECT is not set.

You're right.

> vmx->msr_test_ctl holds the proper value, which is either 0 or
> CORE_CAP_SPLIT_LOCK_DETECT until more bits are supported.
> 
> So the check needs to be guest_cpuid_has(X86_FEATURE_CORE_CAPABILITY).

I don't think so. There is no dependency between
guest_cpuid_has(X86_FEATURE_CORE_CAPABILITY) and MSR_TEST_CTL.
guest_cpuid_has(X86_FEATURE_CORE_CAPABILITY) only indicates that the guest
has the MSR CORE_CAPABILITY.

Since MSR_TEST_CTL is emulated with vmx->msr_test_ctl, I think it's OK to
always let userspace or the guest read MSR_TEST_CTL; the read just returns
the emulated value. Like you said, "vmx->msr_test_ctl holds the proper
value, which is either 0 or CORE_CAP_SPLIT_LOCK_DETECT until more bits are
supported."

> > +			return 1;
> > +		msr_info->data = vmx->msr_test_ctl;
> > +		break;
> >  #ifdef CONFIG_X86_64
> >  	case MSR_FS_BASE:
> >  		msr_info->data = vmcs_readl(GUEST_FS_BASE);
> > @@ -1797,6 +1803,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct
> > msr_data *msr_info)
> >  	u32 index;
> > 
> >  	switch (msr_index) {
> > +	case MSR_TEST_CTL:
> > +		if (!(vcpu->arch.core_capability & CORE_CAP_SPLIT_LOCK_DETECT))
> > +			return 1;
> 
> Again, this wants to be a check for the availability of the MSR in the
> guest cpuid and not to the CORE_CAP_SPLIT_LOCK_DETECT magic bit.
> 
> > +
> > +		if (data & ~TEST_CTL_ENABLE_SPLIT_LOCK_DETECT)
> > +			return 1;
> 
> and this one wants to be:
> 
>        if (data & vmx->msr_test_ctl_mask)
> 
> so additional bits can be enabled later on at exactly one point in the
> code, i.e. during vm init.

Again, in the wrmsr handler, since MSR_TEST_CTL is emulated, the guest can
be considered to always have this MSR; the handler just needs to return 1
when reserved bits are set.

Using vmx->msr_test_ctl_mask is a good idea. I will use it in the next
version.

> > +		vmx->msr_test_ctl = data;
> > +		break;
> > 
> > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
> > +{
> > +	u64 host_msr_test_ctl;
> > +
> > +	/* if TEST_CTL MSR doesn't exist on the hardware, do nothing */
> > +	if (rdmsrl_safe(MSR_TEST_CTL, &host_msr_test_ctl))
> > +		return;
> 
> Yikes. So on hosts which do not have MSR_TEST_CTL this takes a fault on
> every VM enter. The host knows whether it has this MSR or not.
> 
> Even if it exists there is no point to do the rdmsr on every vmenter. The
> value should be cached per cpu. It only changes when:
> 
>  1) #AC triggers in kernel space
> 
>  2) The sysfs knob is flipped
> 
> #1 It happens either _BEFORE_ or _AFTER_ atomic_switch_msr_test_ctl(). In
>    both cases the cached value is correct in atomic_switch_msr_test_ctl().
> 
>    If it happens _AFTER_ atomic_switch_msr_test_ctl() then the VMEXIT will
>    restore the enabled bit, but there is nothing you can do about that.
> 
> #2 CANNOT happen in the context of vcpu_run().
> 
> So this wants to be:
> 
>      host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);

You're right and thanks for your advice. I will take it in the next
version.
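To make sure I understood, below is roughly what I have in mind for the
next version (untested sketch). I am assuming a per-cpu variable
msr_test_ctl_cache that is kept up to date by the #AC handler and the
sysfs knob, treating vmx->msr_test_ctl_mask as the set of bits valid for
the guest (fixed once at vm init), and using X86_FEATURE_SPLIT_LOCK_DETECT
as the "host has MSR_TEST_CTL" check:

In vmx_get_msr():

	case MSR_TEST_CTL:
		/* MSR_TEST_CTL is fully emulated, reads always succeed. */
		msr_info->data = vmx->msr_test_ctl;
		break;

In vmx_set_msr():

	case MSR_TEST_CTL:
		/* Reject bits that were not enabled at vm init. */
		if (data & ~vmx->msr_test_ctl_mask)
			return 1;
		vmx->msr_test_ctl = data;
		break;

And the atomic switch helper without the rdmsr on every vmenter:

static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
{
	u64 host_msr_test_ctl;

	/* Nothing to do if the host does not have MSR_TEST_CTL. */
	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
		return;

	/* Per cpu cached host value, only updated outside of vcpu_run(). */
	host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);

	if (host_msr_test_ctl == vmx->msr_test_ctl)
		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
	else
		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
				      host_msr_test_ctl, false);
}

Does that match what you had in mind for the mask check, or did you intend
msr_test_ctl_mask to hold the reserved bits instead?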
> > +	if (host_msr_test_ctl == vmx->msr_test_ctl)
> > +		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
> > +	else
> > +		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
> > +				      host_msr_test_ctl, false);
> > +}
> > +
> >  u64 kvm_get_core_capability(void)
> >  {
> > -	return 0;
> > +	u64 data;
> > +
> > +	rdmsrl_safe(MSR_IA32_CORE_CAPABILITY, &data);
> 
> 	if (!boot_cpu_has(X86_FEATURE_CORE_CAPABILITY))
> 		return 0;
> 
> > +	/* mask non-virtualizable functions */
> > +	data &= CORE_CAP_SPLIT_LOCK_DETECT;
> 
> Why?
> 
> > +	/*
> > +	 * There will be a list of FMS values that have split lock detection
> > +	 * but lack the CORE CAPABILITY MSR. In this case, set
> > +	 * CORE_CAP_SPLIT_LOCK_DETECT since we emulate MSR CORE_CAPABILITY.
> 
> That's completely wrong. X86_FEATURE_SPLIT_LOCK_DETECT is set when the
> capability is enumerated in CPUID or it's set via the FMS quirk.
> 
> So:
> 	data = 0;
> 
> > +	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> > +		data |= CORE_CAP_SPLIT_LOCK_DETECT;
> > +
> > +	return data;
> 
> is absolutely sufficient.

Right, X86_FEATURE_SPLIT_LOCK_DETECT is set in two situations:

 #1 the capability is enumerated by CORE_CAP_SPLIT_LOCK_DETECT (bit 5 of
    MSR_IA32_CORE_CAPABILITY);

 #2 via the FMS list, for processors that have the split lock detection
    feature but don't have MSR_IA32_CORE_CAPABILITY.

These two paths work well for the host, i.e. for the real FMS. However,
when it comes to virtualization, the guest's FMS can be emulated and so
differ from the host's.

Consider the following case: the host CPU is ICELAKE_MOBILE, which doesn't
have X86_FEATURE_CORE_CAPABILITY but has X86_FEATURE_SPLIT_LOCK_DETECT. If
we run a guest with the Skylake CPU model, X86_FEATURE_SPLIT_LOCK_DETECT is
hidden from the guest, since the guest's MSR_IA32_CORE_CAPABILITY reports
zero and the guest's FMS doesn't match the list. So we fail to expose
X86_FEATURE_SPLIT_LOCK_DETECT to the guest.

Since MSR_IA32_CORE_CAPABILITY is emulated in software for the guest, we
can enforce using #1 to report X86_FEATURE_SPLIT_LOCK_DETECT to the guest,
so the guest is not limited by the FMS list.

> Thanks,
> 
> 	tglx
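P.S. For the record, here is the kvm_get_core_capability() I plan to put in
the next version, following your suggestion above (untested sketch):

u64 kvm_get_core_capability(void)
{
	u64 data = 0;

	/*
	 * The guest's MSR_IA32_CORE_CAPABILITY is emulated, so report
	 * split lock detection whenever the host has it, no matter
	 * whether the host enumerated it via the CORE_CAPABILITY MSR
	 * or via the FMS quirk.
	 */
	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
		data |= CORE_CAP_SPLIT_LOCK_DETECT;

	return data;
}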