Re: [PATCH] KVM: x86/pmu: Return correct value of IA32_PERF_CAPABILITIES for userspace after vCPU has run

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 13, 2024, Wei W Wang wrote:
> On Wednesday, March 13, 2024 8:38 AM, Mingwei Zhang wrote:
> > Return correct value of IA32_PERF_CAPABILITIES when userspace tries to read
> > it after vCPU has already run. Previously, KVM will always return the guest
> > cached value on get_msr() even if guest CPUID lacks X86_FEATURE_PDCM. The
> > guest cached value on default is kvm_caps.supported_perf_cap. However,
> > when userspace sets the value during live migration, the call fails because of
> > the check on X86_FEATURE_PDCM.
> 
> Could you point where in the set_msr path that could fail?
> (I don’t find there is a check of X86_FEATURE_PDCM in vmx_set_msr and
> kvm_set_msr_common)

The changelog is misleading, it's not the PDCM feature bit, it's the PMU version
check in vmx_set_msr():

	case MSR_IA32_PERF_CAPABILITIES:
		if (data && !vcpu_to_pmu(vcpu)->version)
			return 1;

> > Initially, it sounds like a pure userspace issue. It is not. After vCPU has run,
> > KVM should faithfully return correct value to satisify legitimate requests from
> > userspace such as VM suspend/resume and live migrartion. In this case, KVM
> > should return 0 when guest cpuid lacks X86_FEATURE_PDCM. 
> Some typos above (satisfy, migration, CPUID)
> 
> Seems the description here isn’t aligned to your code below?
> The code below prevents userspace from reading the MSR value (not return 0 as the
> read value) in that case.

Ya.

> >So fix the
> > problem by adding an additional check in vmx_set_msr().
> > 
> > Note that IA32_PERF_CAPABILITIES is emulated on AMD side, which is fine
> > because it set_msr() is guarded by kvm_caps.supported_perf_cap which is
> > always 0.
> > 
> > Cc: Aaron Lewis <aaronlewis@xxxxxxxxxx>
> > Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> > Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
> > 40e3780d73ae..6d8667b56091 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -2049,6 +2049,17 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu,
> > struct msr_data *msr_info)
> >  		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
> >  			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
> >  		break;
> > +	case MSR_IA32_PERF_CAPABILITIES:
> > +		/*
> > +		 * Host VMM should not get potentially invalid MSR value if
> > vCPU
> > +		 * has already run but guest cpuid lacks the support for the
> > +		 * MSR.
> > +		 */
> > +		if (msr_info->host_initiated &&
> > +		    kvm_vcpu_has_run(vcpu) &&
> > +		    !guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
> > +			return 1;

As Wei pointed out, this doesn't match the changelog.  And I don't think this is
what we want, at least not in isolation.  Making KVM more restrictive on userspace
reads doesn't solve the live migration save/restore issue, and the kvm_vcpu_has_run()
adds yet another flavor of MSR handling.

We discussed this whole MSRs mess at PUCK this morning.  I forgot to hit RECORD,
but Paolo took notes and will post them soon.

Going from memory, the plan is to:

  1. Commit to, and document, that userspace must do KVM_SET_CPUID{,2} prior to
     KVM_SET_MSRS.

  2. Go with roughly what I proposed in the CET thread (for unsupported MSRS,
     read 0 and drop writes of '0')[*].

  3. Add a quire for PERF_CAPABILITIES, ARCH_CAPABILITIES, and PLATFORM_INFO
     (if quirk==enabled, keep KVM's current behavior; if quirk==disabled, zero-
      initialize the MSRs).

With those pieces in place, KVM can simply check X86_FEATURE_PDCM for both reads
and writes to PERF_CAPABILITIES, and the common userspace MSR handling will
convert "unsupported" to "success" as appropriate.

[*] https://lore.kernel.org/all/ZfDdS8rtVtyEr0UR@xxxxxxxxxx





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux