Re: [PATCH 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 22, 2023 at 11:15 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Sep 22, 2023, Jim Mattson wrote:
> > On Fri, Sep 22, 2023 at 10:21 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > IMO, we should delete the offending kernel code.  I don't see how it provides any
> > > value these days.
> >
> > Sure, but that doesn't help legacy guests.
>
> Heh, IMO they don't need help, their owners just need to be placated ;-)
>
> > > And *if* we want to change something in KVM so that we stop getting coustomer
> > > complaints about a useless bit, just let userspace stuff the bit.
> >
> > We want to make customers happy. That should not even be a question.
>
> Can we really not tell them "this is a benign guest bug, ignore it"?

What is the mechanism for doing that?

> > > I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand
> > > that bits in HWCR that KVM allows to be set are architecturally defined.  It's
> > > totally fine if the value of bit 24 is uarch specific, but the behavior needs to
> > > be something that won't change from processor to processor.
> > >
> > > >       kvm_pmu_refresh(vcpu);
> > > >       vcpu->arch.cr4_guest_rsvd_bits =
> > > >           __cr4_reserved_bits(guest_cpuid_has, vcpu);
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index 3421ed7fcee0..cb02a7c2938b 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > >               data &= ~(u64)0x40;     /* ignore flush filter disable */
> > > >               data &= ~(u64)0x100;    /* ignore ignne emulation enable */
> > > >               data &= ~(u64)0x8;      /* ignore TLB cache disable */
> > > > +             data &= ~(u64)0x1000000;/* ignore TscFreqSel */
> > > >
> > > >               /* Handle McStatusWrEn */
> > > >               if (data & ~BIT_ULL(18)) {
> > > >                       kvm_pr_unimpl_wrmsr(vcpu, msr, data);
> > > >                       return 1;
> > > >               }
> > > > +
> > > > +             /*
> > > > +              * When set, TscFreqSel is read-only. Attempts to
> > > > +              * clear it are ignored.
> > > > +              */
> > > > +             data |= vcpu->arch.msr_hwcr & BIT_ULL(24);
> > >
> > >
> > > The bit is read-only from the guest, but KVM needs to let userspace clear the
> > > bit.
> >
> > Why? We don't let userspace clear bit 1 of EFLAGS, which is also a
> > "reads as one" bit.
>
> Because that's architectural behavior, not dependent on FMS, and KVM needs to
> write EFLAGS to have any hope of being useful, i.e. giving ownership of EFLAGS
> to userspace is not a realistic option.

Remind me what "MSR" stands for. :)

> As proposed, if userspace sets CPUID to a magic FMS, and then changes the FMS to
> something else, kvm_vcpu_after_set_cpuid() will not clear the bit and KVM will
> end up wrongly enumerating the bit.  I doubt userspace would ever do that, but
> it's at least possible.
>
> That could be fixed by actively clearing vcpu->arch.msr_hwcr for other FMS values,
> but then KVM would have to be 100% precise on the FMS matching, which would be a
> maintenance nightmare.

What if I did something crude like we do for MSR_IA32_MISC_ENABLE, and
just set the bit at reset regardless of FMS:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cb02a7c2938b..4d7d0de42a9d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12086,6 +12086,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu,
bool init_event)
                vcpu->arch.msr_misc_features_enables = 0;
                vcpu->arch.ia32_misc_enable_msr =
MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |

MSR_IA32_MISC_ENABLE_BTS_UNAVAIL;
+               vcpu_arch.msr_hwcr = BIT_ULL(24);

                __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP);
                __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true);

> In other words, userspace owns the vCPU model, and for good reasons.  KVM needs
> to allow userspace to define a sane model, but with *very* few exceptions, KVM
> should not try to "help" userspace by stuffing bits.

Okay. What about the IA32_MISC_ENABLE bits above?

> Pretty much everytime KVM tries to help, it causes problems.  E.g. initializing
> perf_capabilities to kvm_caps.supported_perf_cap seems like a good thing, except
> it presents a bogus model if userspace decides to not enumerate a vPMU to the
> guest (Aaron was allegedly going to send a patch for this...).

KVM is nothing if not inconsistent.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux