On Fri, Sep 22, 2023, Jim Mattson wrote: > On certain CPUs, Linux guests expect HWCR.TscFreqSel[bit 24] to be > set. If it isn't set, they complain: > [Firmware Bug]: TSC doesn't count with P0 frequency! > > Eliminate this complaint by setting the bit on virtual processors for > which Linux guests expect it to be set. > > Note that this bit is read-only on said processors. > > Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx> > --- > arch/x86/kvm/cpuid.c | 10 ++++++++++ > arch/x86/kvm/x86.c | 7 +++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > index 0544e30b4946..2d7dcd13dcc3 100644 > --- a/arch/x86/kvm/cpuid.c > +++ b/arch/x86/kvm/cpuid.c > @@ -373,6 +373,16 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) > vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); > vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu); > > + /* > + * HWCR.TscFreqSel[bit 24] has a reset value of 1 on some processors. > + */ > + if (guest_cpuid_is_amd_or_hygon(vcpu) && > + guest_cpuid_has(vcpu, X86_FEATURE_CONSTANT_TSC) && > + (guest_cpuid_family(vcpu) > 0x10 || > + (guest_cpuid_family(vcpu) == 0x10 && > + guest_cpuid_model(vcpu) >= 2))) > + vcpu->arch.msr_hwcr |= BIT(24); Oh hell no. It's bad enough that KVM _allows_ setting uarch specific bits, but actively setting bits is a step too far. IMO, we should delete the offending kernel code. I don't see how it provides any value these days. And *if* we want to change something in KVM so that we stop getting coustomer complaints about a useless bit, just let userspace stuff the bit. I think we should also raise the issue with AMD (Borislav maybe?) and ask/demand that bits in HWCR that KVM allows to be set are architecturally defined. It's totally fine if the value of bit 24 is uarch specific, but the behavior needs to be something that won't change from processor to processor. > kvm_pmu_refresh(vcpu); > vcpu->arch.cr4_guest_rsvd_bits = > __cr4_reserved_bits(guest_cpuid_has, vcpu); > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 3421ed7fcee0..cb02a7c2938b 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3699,12 +3699,19 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > data &= ~(u64)0x40; /* ignore flush filter disable */ > data &= ~(u64)0x100; /* ignore ignne emulation enable */ > data &= ~(u64)0x8; /* ignore TLB cache disable */ > + data &= ~(u64)0x1000000;/* ignore TscFreqSel */ > > /* Handle McStatusWrEn */ > if (data & ~BIT_ULL(18)) { > kvm_pr_unimpl_wrmsr(vcpu, msr, data); > return 1; > } > + > + /* > + * When set, TscFreqSel is read-only. Attempts to > + * clear it are ignored. > + */ > + data |= vcpu->arch.msr_hwcr & BIT_ULL(24); The bit is read-only from the guest, but KVM needs to let userspace clear the bit. > vcpu->arch.msr_hwcr = data; > break; > case MSR_FAM10H_MMIO_CONF_BASE: > -- > 2.42.0.515.g380fc7ccd1-goog >