Am 20/03/2023 um 15:53 schrieb Sean Christopherson: > On Fri, Mar 17, 2023, Pawan Gupta wrote: >> On Fri, Mar 17, 2023 at 04:14:01PM -0700, Nathan Chancellor wrote: >>> On Fri, Mar 17, 2023 at 03:53:45PM -0700, Pawan Gupta wrote: >>>> On Fri, Mar 17, 2023 at 12:04:32PM -0700, Nathan Chancellor wrote: >>>>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c >>>>>> index c788aa382611..9a78ea96a6d7 100644 >>>>>> --- a/arch/x86/kvm/vmx/vmx.c >>>>>> +++ b/arch/x86/kvm/vmx/vmx.c >>>>>> @@ -2133,6 +2133,39 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated >>>>>> return debugctl; >>>>>> } >>>>>> >>>>>> +static int vmx_set_msr_ia32_cmd(struct kvm_vcpu *vcpu, >>>>>> + struct msr_data *msr_info, >>>>>> + bool guest_has_feat, u64 cmd, >>>>>> + int x86_feature_bit) >>>>>> +{ >>>>>> + if (!msr_info->host_initiated && !guest_has_feat) >>>>>> + return 1; >>>>>> + >>>>>> + if (!(msr_info->data & ~cmd)) >>>> >>>> Looks like this is doing a reverse check. Shouldn't this be as below: >>> >>> That diff on top of next-20230317 appears to resolve the issue for me >>> and my L1 guest can spawn an L2 guest without any issues (which is the >>> extent of my KVM testing). >> >> Great! >> >>> Is this a problem for the SVM version? It has the same check it seems, >>> although I did not have any issues on my AMD test platform (but I guess >>> that means that the system has the support?). >> >> IIUC, SVM version also needs to be fixed. > > Yeah, looks that way. If we do go this route, can you also rename "cmd" to something > like "allowed_mask"? It took me far too long to understand what "cmd" represents. > >>> I assume this will just be squashed into the original change but if not: >> >> Thats what I think, and if its too late to be squashed I will send a >> formal patch. Maintainers? > > Honestly, I'd rather revert the whole mess and try again. The patches obviously > weren't tested, Well... no. They were tested. Call it wrongly tested, badly tested, whatever you want but don't say "obviously weren't tested". I even asked you in a private email why the cpu flag was visible in Linux and not in rhel when using the same machine. So again, my bad with these patches, I sincerely apologize but I would prefer that you think I don't know how to test this stuff rather than say that I carelessly sent something without checking :) About the rest of the email: whatever you decide is fine for me. Thank you, Emanuele and the entire approach (that was borrowed from the existing > MSR_IA32_PRED_CMD code) makes no sense. > > The MSRs are write-only command registers, i.e. don't have a persistent value. > So unlike MSR_IA32_SPEC_CTRL (which I suspect was the source for the copy pasta), > where KVM needs to track the guest value, there are no downsides to disabling > interception of the MSRs. > > Manually checking the value written by the guest or host userspace is similarly > ridiculous. The MSR is being exposed directly to the guest, i.e. after the first > access, the guest can throw any value at bare metal anyways, so just do wrmsrl_safe() > and call it good. > > In other words, the common __kvm_set_msr() switch should have something like so, > > case MSR_IA32_PRED_CMD: > if (!cpu_feature_enabled(X86_FEATURE_IBPB)) > return 1; > > if (!msr_info->host_initiated && > !guest_cpuid_has(vcpu, guest_has_pred_cmd_msr(vcpu))) > return 1; > > ret = !!wrmsrl_safe(MSR_IA32_PRED_CMD, msr_info->data); > break; > case MSR_IA32_FLUSH_CMD: > if (!cpu_feature_enabled(X86_FEATURE_FLUSH_L1D)) > return 1; > > if (!msr_info->host_initiated && > !guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D)) > return 1; > > ret = !!wrmsrl_safe(MSR_IA32_FLUSH_CMD, msr_info->data); > break; > > with the MSR interception handled in e.g. vmx_vcpu_after_set_cpuid(). > > Paolo? >