On 11/12/2024 15:37, Tom Lendacky wrote: > On 12/10/24 16:43, Sean Christopherson wrote: >> On Tue, Dec 10, 2024, Tom Lendacky wrote: >>> On 12/10/24 14:33, Simon Pilkington wrote: >>>> On 10/12/2024 16:47, Sean Christopherson wrote: >>>>> Can you run with the below to see what bits the guest is trying to set (or clear)? >>>>> We could get the same info via tracepoints, but this will likely be faster/easier. >>>>> >>>>> --- >>>>> arch/x86/kvm/svm/svm.c | 12 +++++++++--- >>>>> 1 file changed, 9 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >>>>> index dd15cc635655..5144d0283c9d 100644 >>>>> --- a/arch/x86/kvm/svm/svm.c >>>>> +++ b/arch/x86/kvm/svm/svm.c >>>>> @@ -3195,11 +3195,14 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) >>>>> case MSR_AMD64_DE_CFG: { >>>>> u64 supported_de_cfg; >>>>> >>>>> - if (svm_get_feature_msr(ecx, &supported_de_cfg)) >>>>> + if (WARN_ON_ONCE(svm_get_feature_msr(ecx, &supported_de_cfg))) >>>>> return 1; >>>>> >>>>> - if (data & ~supported_de_cfg) >>>>> + if (data & ~supported_de_cfg) { >>>>> + pr_warn("DE_CFG supported = %llx, WRMSR = %llx\n", >>>>> + supported_de_cfg, data); >>>>> return 1; >>>>> + } >>>>> >>>>> /* >>>>> * Don't let the guest change the host-programmed value. The >>>>> @@ -3207,8 +3210,11 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) >>>>> * are completely unknown to KVM, and the one bit known to KVM >>>>> * is simply a reflection of hardware capabilities. >>>>> */ >>>>> - if (!msr->host_initiated && data != svm->msr_decfg) >>>>> + if (!msr->host_initiated && data != svm->msr_decfg) { >>>>> + pr_warn("DE_CFG current = %llx, WRMSR = %llx\n", >>>>> + svm->msr_decfg, data); >>>>> return 1; >>>>> + } >>>>> >>>>> svm->msr_decfg = data; >>>>> break; >>>>> >>>>> base-commit: fac04efc5c793dccbd07e2d59af9f90b7fc0dca4 >>>> >>>> Relevant dmesg output with some context below. VM locked up as expected. >>>> >>>> [ 85.834971] vfio-pci 0000:0c:00.0: resetting >>>> [ 85.937573] vfio-pci 0000:0c:00.0: reset done >>>> [ 86.494210] vfio-pci 0000:0c:00.0: resetting >>>> [ 86.494264] vfio-pci 0000:0c:00.1: resetting >>>> [ 86.761442] vfio-pci 0000:0c:00.0: reset done >>>> [ 86.761480] vfio-pci 0000:0c:00.1: reset done >>>> [ 86.762392] vfio-pci 0000:0c:00.0: resetting >>>> [ 86.865462] vfio-pci 0000:0c:00.0: reset done >>>> [ 86.977360] virbr0: port 1(vnet1) entered learning state >>>> [ 88.993052] virbr0: port 1(vnet1) entered forwarding state >>>> [ 88.993057] virbr0: topology change detected, propagating >>>> [ 103.459114] kvm_amd: DE_CFG current = 0, WRMSR = 2 >>>> [ 161.442032] virbr0: port 1(vnet1) entered disabled state // VM shut down >>> >>> That is the MSR_AMD64_DE_CFG_LFENCE_SERIALIZE bit. Yeah, that actually >>> does change the behavior of LFENCE and isn't just a reflection of the >>> hardware. >>> >>> Linux does set that bit on boot, too (if LFENCE always serializing isn't >>> advertised 8000_0021_EAX[2]), so I'm kind of surprised it didn't pop up >>> there. >> >> Linux may be running afoul of this, but it would only become visible if someone >> checked dmesg. Even the "unsafe" MSR accesses in Linux gracefully handle faults >> these days, the only symptom would be a WARN. >> >>> I imagine that the above CPUID bit isn't set, so an attempt is made to >>> set the MSR bit. >> >> Yep. And LFENCE_RDTSC _is_ supported, otherwise the supported_de_cfg check would >> have failed. Which means it's a-ok for the guest to set the bit, i.e. KVM won't >> let the guest incorrectly think it's running on CPU for which LFENCE is serializing. >> >> Unless you (Tom) disagree, I vote to simply drop the offending code, i.e. make >> all supported bits fully writable from the guest. KVM is firmly in the wrong here, >> and I can't think of any reason to disallow the guest from clearing LFENCE_SERIALIZE. >> >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >> index 6a350cee2f6c..5a82ead3bf0f 100644 >> --- a/arch/x86/kvm/svm/svm.c >> +++ b/arch/x86/kvm/svm/svm.c >> @@ -3201,15 +3201,6 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) >> if (data & ~supported_de_cfg) >> return 1; >> >> - /* >> - * Don't let the guest change the host-programmed value. The >> - * MSR is very model specific, i.e. contains multiple bits that >> - * are completely unknown to KVM, and the one bit known to KVM >> - * is simply a reflection of hardware capabilities. >> - */ >> - if (!msr->host_initiated && data != svm->msr_decfg) >> - return 1; >> - > > That works for me. > > Thanks, > Tom > >> svm->msr_decfg = data; >> break; >> } >> Thanks for the prompt response on this Sean & Tom. Regards, Simon