On Fri, Apr 26, 2024, Weijiang Yang wrote: > On 4/26/2024 2:14 AM, Sean Christopherson wrote: > > Extend KVM's suppression of failures due to a userspace access to an > > unsupported, but advertised as a "to save" MSR to all MSRs, not just those > > that happen to reach the default case statements in kvm_get_msr_common() > > and kvm_set_msr_common(). KVM's soon-to-be-established ABI is that if an > > MSR is advertised to userspace, then userspace is allowed to read the MSR, > > and write back the value that was read, i.e. why an MSR is unsupported > > doesn't change KVM's ABI. > > > > Practically speaking, this is very nearly a nop, as the only other paths > > that return KVM_MSR_RET_UNSUPPORTED are {svm,vmx}_get_feature_msr(), and > > it's unlikely, though not impossible, that userspace is using KVM_GET_MSRS > > on unsupported MSRs. > > > > The primary goal of moving the suppression to common code is to allow > > returning KVM_MSR_RET_UNSUPPORTED as appropriate throughout KVM, without > > having to manually handle the "is userspace accessing an advertised" > > waiver. I.e. this will allow formalizing KVM's ABI without incurring a > > high maintenance cost. > > > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> > > --- > > arch/x86/kvm/x86.c | 27 +++++++++------------------ > > 1 file changed, 9 insertions(+), 18 deletions(-) > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 04a5ae853774..4c91189342ff 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -527,6 +527,15 @@ static __always_inline int kvm_do_msr_access(struct kvm_vcpu *vcpu, u32 msr, > > if (ret != KVM_MSR_RET_UNSUPPORTED) > > return ret; > > + /* > > + * Userspace is allowed to read MSRs, and write '0' to MSRs, that KVM > > + * reports as to-be-saved, even if an MSR isn't fully supported. > > + * Simply check that @data is '0', which covers both the write '0' case > > + * and all reads (in which case @data is zeroed on failure; see above). > > + */ > > + if (host_initiated && !*data && kvm_is_msr_to_save(msr)) > > + return 0; > > + > > IMHO, it's worth to document above phrase into virt/kvm/api.rst KVM_{GET, > SET}_MSRS sections as a note because when users space reads/writes MSRs > successfully, it doesn't necessarily mean the operation really took effect. > Maybe it's just due to the fact they're exposed in "to-be-saved" list. Agreed, though I think I'd prefer to wait to officially document the behavior until we have fully converted KVM's internals to KVM_MSR_RET_UNSUPPORTED. I'm 99% certain the behavior won't actually be as simple as "userspace can write '0'", e.g. I know of at least one case where KVM allows '0' _or_ the KVM's non-zero default. In the case that I'm aware of (MSR_AMD64_TSC_RATIO), the "default" is a hardcoded KVM constant, e.g. won't change based on underlying hardware, but there me be other special cases lurking, and we won't know until we complete the conversion :-(