Re: [PATCH v3 2/3] KVM: x86: Virtualize HWCR.TscFreqSel[bit 24]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 28, 2023 at 5:56 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Sep 28, 2023, Jim Mattson wrote:
> > On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be
> > set. If it isn't set, they complain:
> >       [Firmware Bug]: TSC doesn't count with P0 frequency!
> >
> > Allow userspace to set this bit in the virtual HWCR to eliminate the
> > above complaint.
> >
> > Attempts to clear this bit from within the guest are ignored, to match
> > the behavior of modern AMD processors.
> >
> > Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/x86.c | 19 +++++++++++++++++--
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 1a323cae219c..9209fc0d1a51 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -3700,11 +3700,26 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >               data &= ~(u64)0x100;    /* ignore ignne emulation enable */
> >               data &= ~(u64)0x8;      /* ignore TLB cache disable */
> >
> > -             /* Handle McStatusWrEn */
> > -             if (data & ~BIT_ULL(18)) {
> > +             /*
> > +              * Ignore guest attempts to set TscFreqSel.
> > +              */
>
> No need for a multi-line comment.
> /
> > +             if (!msr_info->host_initiated)
> > +                     data &= ~BIT_ULL(24);
>
> There's no need to clear this before the check below.  The (arguably useless)
> print will show the "supported" bit, but I can't imagine anyone will care.
>
> > +
> > +             /*
> > +              * Allow McStatusWrEn and (from the host) TscFreqSel.
>
> This is unnecessarily confusing IMO, just state that writes to TscFreqSel are
> architecturally ignored.  This would also be an opportune time to explain why
> KVM allows this stupidity...
>
> > +              */
> > +             if (data & ~(BIT_ULL(18) | BIT_ULL(24))) {
> >                       kvm_pr_unimpl_wrmsr(vcpu, msr, data);
> >                       return 1;
> >               }
> > +
> > +             /*
> > +              * TscFreqSel is read-only from within the
> > +              * guest. Attempts to clear it are ignored.
>
> Overly aggressive wrapping.
>
> How about this?
>
> ---
>  arch/x86/kvm/x86.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 791a644dd481..4dd64d359142 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3700,11 +3700,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>                 data &= ~(u64)0x100;    /* ignore ignne emulation enable */
>                 data &= ~(u64)0x8;      /* ignore TLB cache disable */
>
> -               /* Handle McStatusWrEn */
> -               if (data & ~BIT_ULL(18)) {
> +               /*
> +                * Allow McStatusWrEn and TscFreqSel, some guests whine if they
> +                * aren't set.
> +                */

The whining is only about TscFreqSel. KVM actually supports the
functionality of McStatusWrEn (i.e. allows the guest to write to the
MCi_STATUS MSRs).

How about...

/*
* Allow McStatusWrEn and TscFreqSel. (Linux guests from v3.2 through
* at least v6.6 whine if TscFreqSel is clear, depending on F/M/S.
*/

> +               if (data & ~(BIT_ULL(18) | BIT_ULL(24))) {
>                         kvm_pr_unimpl_wrmsr(vcpu, msr, data);
>                         return 1;
>                 }
> +
> +               /* TscFreqSel is architecturally read-only, writes are ignored */

This isn't true. TscFreqSel is not architectural at all. On Family
10h, per https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf,
it was R/W and powered on as 0. In Family 15h, one of the "changes
relative to Family 10H Revision D processors," per
https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/42301_15h_Mod_00h-0Fh_BKDG.pdf,
was:

• MSRC001_0015 [Hardware Configuration (HWCR)]:
  • Dropped TscFreqSel; TSC can no longer be selected to run at NB P0-state.

Despite the "Dropped" above, that same document later describes
HWCR[bit 24] as follows:

TscFreqSel: TSC frequency select. Read-only. Reset: 1. 1=The TSC
increments at the P0 frequency

So, perhaps this block of code can just be dropped? Who really cares
if the guest changes the value (unless it goes on to kexec a new
kernel, which will whine about the bit being clear)?

> +               if (!msr_info->host_initiated)
> +                       data = ~(data & BIT_ULL(24)) |
> +                              (vcpu->arch.msr_hwcr & BIT_ULL(24));
>                 vcpu->arch.msr_hwcr = data;
>                 break;
>         case MSR_FAM10H_MMIO_CONF_BASE:
>
> base-commit: 831ee29a0d4a2219d30268f9fc577217d222e339
> --
>




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux