On Thu, Nov 10, 2022, Yu Zhang wrote:
> > > BTW, I found my previous understanding of what vmx_adjust_secondary_exec_control()
> > > currently does was also wrong. It could also be used for EXITING controls. And
> > > for such flags(e.g., SECONDARY_EXEC_RDRAND_EXITING), values for the nested settings
> > > (vmx->nested.msrs.secondary_ctls_high) and for the L1 execution controls(*exec_control)
> > > could be opposite. So the statement:
> > > "1> For now, what vmx_adjust_secondary_exec_control() does, is to enable/
> > > disable a feature in VMX MSR(and nVMX MSR) based on cpuid changes."
> > > is wrong.
> >
> > No, it's correct.  The EXITING controls are just inverted feature flags.  E.g. if
> > RDRAND is disabled in CPUID, KVM sets the EXITING control so that KVM intercepts
> > RDRAND in order to inject #UD.
> >
> > 	[EXIT_REASON_RDRAND]                  = kvm_handle_invalid_op,
> >
>
> Well, suppose
> - cpu_has_vmx_rdrand() is true;
> - meanwhile guest_cpuid_has(vcpu, X86_FEATURE_RDRAND) is false.
>
> And then, what vmx_adjust_secondary_exec_control() currently does is:
> 1> keep the SECONDARY_EXEC_RDRAND_EXITING set in L1 secondary proc-based
>    execution control.
> 2> and then clear the SECONDARY_EXEC_RDRAND_EXITING in the high bits
>    of IA32_VMX_PROCBASED_CTLS2 MSR for nested by
>    vmx->nested.msrs.secondary_ctls_high &= ~control;
> That means for L1 VMM, SECONDARY_EXEC_RDRAND_EXITING must be cleared
> in its(VMCS12's) secondary proc-based VM-execution control, even when
> rdrand is disabled in L1's and L2's CPUID.

Again, it is _userspace's_ responsibility to provide a sane, consistent CPU model
to the guest.

> I wonder, for native environment, if an instruction is not supported,
> will the allowed 1-setting for its corresponding exiting feature in
> IA32_VMX_PROCBASED_CTLS2 MSR be set, or be cleared? Maybe it should
> be cleared, and executing such instruction in non-root will just get
> a #UD directly instead of triggering a VM-Exit?

For any reasonable interpretation of the SDM, it's a moot point.  The SDM doesn't
call out these scenarios for instructions like RDTSCP because they're nonsensical,
but for other instructions that can be trapped by the hypervisor and can take a
#UD when they're supported, the #UD is prioritized over the VM-Exit.

E.g. VMX instructions have pseudocode like:

	IF not in VMX operation
		THEN #UD;
	ELSIF in VMX non-root operation
		THEN VM exit;

In other words, if the CPU doesn't recognize an instruction, it will generate a
#UD without getting to the (presumed) microcode flow that checks for VM-Exit.
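
Purely as an illustration of that ordering, here is a toy C model.  It is a
sketch under stated assumptions, not KVM code and not anything lifted from the
SDM: struct toy_cpu, enum toy_outcome and toy_execute() are hypothetical names
made up for this example, and the "instruction supported" check simply stands
in for whatever the CPU does before it reaches any VM-Exit checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of attempting to execute one instruction. */
enum toy_outcome {
	TOY_UD,		/* instruction not recognized, #UD is raised */
	TOY_VM_EXIT,	/* intercepted, control goes to the hypervisor */
	TOY_EXECUTE,	/* instruction runs normally */
};

/* Toy state; the field names are illustrative, not architectural terms. */
struct toy_cpu {
	bool insn_supported;	/* does the CPU recognize the instruction? */
	bool in_non_root;	/* running in VMX non-root operation? */
	bool exiting_control;	/* is the matching "exiting" control set? */
};

static enum toy_outcome toy_execute(const struct toy_cpu *cpu)
{
	assert(cpu != NULL);

	/* Assumed ordering: the #UD check happens before any exit check. */
	if (!cpu->insn_supported)
		return TOY_UD;

	/* Only a recognized instruction ever consults the exiting control. */
	if (cpu->in_non_root && cpu->exiting_control)
		return TOY_VM_EXIT;

	return TOY_EXECUTE;
}
```

In this model, an unsupported instruction in non-root operation yields TOY_UD
whether or not the exiting control is set, i.e. the value of the control is
irrelevant for an instruction the CPU doesn't know about.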