On 03-May-24 5:21 AM, Sean Christopherson wrote: > On Tue, Apr 16, 2024, Ravi Bangoria wrote: >> Currently, LBR Virtualization is dynamically enabled and disabled for >> a vcpu by intercepting writes to MSR_IA32_DEBUGCTLMSR. This helps by >> avoiding unnecessary save/restore of LBR MSRs when nobody is using it >> in the guest. However, SEV-ES guest mandates LBR Virtualization to be >> _always_ ON[1] and thus this dynamic toggling doesn't work for SEV-ES >> guest, in fact it results into fatal error: >> >> SEV-ES guest on Zen3, kvm-amd.ko loaded with lbrv=1 >> >> [guest ~]# wrmsr 0x1d9 0x4 >> KVM: entry failed, hardware error 0xffffffff >> EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000 >> ... >> >> Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests. > > Uh, what? I mean, sure, it works, maybe, I dunno. But there's a _massive_ > disconnect between the first paragraph and this statement. > > Oh, good gravy, it "works" because SEV already forces LBR virtualization. > > svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; > > (a) the changelog needs to call that out. Sorry, I should have called that out explicitly. > (b) KVM needs to disallow SEV-ES if > LBR virtualization is disabled by the admin, i.e. if lbrv=false. That's what I initially thought. But since KVM currently allows booting SEV-ES guests even when lbrv=0 (by silently ignoring lbrv value), erroring out would be a behavior change. > Alternatively, I would be a-ok simply deleting lbrv, e.g. to avoid yet more > printks about why SEV-ES couldn't be enabled. > > Hmm, I'd probably be more than ok. Because AMD (thankfully, blessedly) uses CPUID > bits for SVM features, the admin can disable LBRV via clear_cpuid (or whatever it's > called now). And there are hardly any checks on the feature, so it's not like > having a boolean saves anything. AMD is clearly committed to making sure LBRV > works, so the odds of KVM really getting much value out of a module param is low. Currently, lbrv is not enabled by default with model specific -cpu profiles in qemu. So I guess this is not backward compatible? > And then when you delete lbrv, please add a WARN_ON_ONCE() sanity check in > sev_hardware_setup() (if SEV-ES is supported), because like the DECODEASSISTS > and FLUSHBYASID requirements, it's not super obvious that LBRV is a hard > requirement for SEV-ES (that's an understatment; I'm curious how some decided > that LBR virtualization is where the line go drawn for "yeah, _this_ is mandatory"). I'm not sure. Some ES internal dependency. In any case, the patch simply fixes 'missed clearing MSR Interception' for SEV-ES guests. So, would it be okay to apply this patch as is and do lbrv cleanup as a followup series? PS: AMD Bus Lock Detect virtualization also dependents on LBR Virtualization: https://lore.kernel.org/r/20240429060643.211-4-ravi.bangoria@xxxxxxx Thanks for the review, Ravi