> No. Again, KVM _should never_ manipulate VMX MSRs in response to CPUID changes.
> Keeping the existing behavior would be done purely to maintain backwards
> compatibility with existing userspace, not because it's strictly the right thing to do.
>
> E.g. as a strawman, a weird userspace could do KVM_SET_MSRS => KVM_SET_CPUID =>
> KVM_SET_CPUID, where the first KVM_SET_CPUID reset to a base config and the second
> KVM_SET_CPUID incorporates "optional" features. In that case, clearing bits in
> the VMX MSRs on the first KVM_SET_CPUID would do the wrong thing if the second
> KVM_SET_CPUID enabled the relevant features.
>
> AFAIK, no userspace actually does something odd like that, whereas there are VMMs
> that do KVM_SET_MSRS before KVM_SET_CPUID, e.g. disable a feature in VMX MSRs but
> later enable the feature in CPUID for L1. And so disabling features is likely
> safe-ish, but enabling features most definitely can cause problems for userspace.
>
> Hrm, actually, there are likely older VMMs that never set VMX MSRs, and so dropping
> the "enable features" code might not be safe either. Grr. The obvious solution
> would be to add a quirk, but maybe we can avoid a quirk by skipping KVM's
> misguided updates if userspace has set the MSR. That should work for a userspace
> that deliberately sets the MSR during setup, and for a userspace that blindly
> migrates the MSR since the migrated value should already be correct/sane.
>

Oh. Just saw your new selftest code, and I finally get your point (I hope so...).
Thanks!

> > BTW, I found my previous understanding of what vmx_adjust_secondary_exec_control()
> > currently does was also wrong. It could also be used for EXITING controls. And
> > for such flags(e.g., SECONDARY_EXEC_RDRAND_EXITING), values for the nested settings
> > (vmx->nested.msrs.secondary_ctls_high) and for the L1 execution controls(*exec_control)
> > could be opposite.
> > So the statement:
> > "1> For now, what vmx_adjust_secondary_exec_control() does, is to enable/
> > disable a feature in VMX MSR(and nVMX MSR) based on cpuid changes."
> > is wrong.
>
> No, it's correct. The EXITING controls are just inverted feature flags. E.g. if
> RDRAND is disabled in CPUID, KVM sets the EXITING control so that KVM intercepts
> RDRAND in order to inject #UD.
>
>	[EXIT_REASON_RDRAND] = kvm_handle_invalid_op,
>

Well, suppose
- cpu_has_vmx_rdrand() is true;
- meanwhile guest_cpuid_has(vcpu, X86_FEATURE_RDRAND) is false.

And then, what vmx_adjust_secondary_exec_control() currently does is:
1> keep SECONDARY_EXEC_RDRAND_EXITING set in L1's secondary proc-based
   execution control.
2> and then clear SECONDARY_EXEC_RDRAND_EXITING in the high bits of the
   IA32_VMX_PROCBASED_CTLS2 MSR for nested by
	vmx->nested.msrs.secondary_ctls_high &= ~control;

That means for the L1 VMM, SECONDARY_EXEC_RDRAND_EXITING must be cleared in
its (VMCS12's) secondary proc-based VM-execution control, even when RDRAND is
disabled in L1's and L2's CPUID.

I wonder, in a native environment, if an instruction is not supported, will
the allowed-1 setting for its corresponding exiting control in the
IA32_VMX_PROCBASED_CTLS2 MSR be set, or be cleared? Maybe it should be
cleared, and executing such an instruction in non-root mode will just get a
#UD directly instead of triggering a VM-Exit?

Note: I do not think this will cause any problem, I am just curious whether
the L1 VMM can observe a behavior that is not supposed to occur in the native
scenario (only because of what we are doing in KVM).

B.R.
Yu