On 9/17/2024 2:11 AM, dongli.zhang@xxxxxxxxxx wrote:
>
> On 9/16/24 11:54 AM, Maxim Levitsky wrote:
>> Hi!
>>
>> We recently saw a failure in one of the aws VM instances that causes the following error during the guest boot:
>>
>> [    0.480051] unchecked MSR access error: WRMSR to 0xc0000302 (tried to write 0x040000000000001f) at rIP: 0xffffffff96c093e2 (amd_pmu_cpu_reset.constprop.0+0x42/0x80)
>>
>>
>> I investigated the issue and I see that the hypervisor does expose PerfmonV2, but not the LBRv2 support:
>>
>> # cpuid -1 -l 0x80000022
>> CPU:
>>    Extended Performance Monitoring and Debugging (0x80000022):
>>       AMD performance monitoring V2         = true
>>       AMD LBR V2                            = false
>>       AMD LBR stack & PMC freezing          = false
>>       number of core perf ctrs              = 0x5 (5)
>>       number of LBR stack entries           = 0x0 (0)
>>       number of avail Northbridge perf ctrs = 0x0 (0)
>>       number of available UMC PMCs          = 0x0 (0)
>>       active UMCs bitmask                   = 0x0
>>

That's expected. LBRv2 is currently not available to KVM guests. However, PerfMonV2 should be the only feature bit required to indicate the availability of MSRs 0xc0000300..0xc0000303

>> I also verified that I can write 0x1f to 0xc0000302 but not 0x040000000000001f:
>>
>> # wrmsr 0xc0000302 0x1f
>> # wrmsr 0xc0000302 0x040000000000001f
>> wrmsr: CPU 0 cannot set MSR 0xc0000302 to 0x040000000000001f
>> #
>>
>> The AMD's APM is not clear on what should happen if unsupported bits are attempted to be cleared
>> using this MSR.
>>
>> Also I noticed that amd_pmu_v2_handle_irq writes 0xffffffffffffffff to this msrs.
>> It has the following code:
>>
>>
>>         WARN_ON(status > 0);
>>
>>         /* Clear overflow and freeze bits */
>>         amd_pmu_ack_global_status(~status);
>>
>>
>> This implies that it is OK to set all bits in this MSR.
>>

It is, but writes to the reserved bits are ignored.

>
> To share my data point on QEMU+KVM: I am not able to reproduce with the most
> recent QEMU (not AWS) + below patch.
>
> [PATCH v2 2/4] i386/cpu: Add PerfMonV2 feature bit
> https://lore.kernel.org/all/69905b486218f8287b9703d1a9001175d04c2f02.1723068946.git.babu.moger@xxxxxxx/
>
> Both my VM and KVM are 6.10.
>
> vm# cpuid -1 -l 0x80000022
> CPU:
>    Extended Performance Monitoring and Debugging (0x80000022):
>       AMD performance monitoring V2         = true
>       AMD LBR V2                            = false
>       AMD LBR stack & PMC freezing          = false
>       number of core perf ctrs              = 0x6 (6)
>       number of LBR stack entries           = 0x0 (0)
>       number of avail Northbridge perf ctrs = 0x0 (0)
>       number of available UMC PMCs          = 0x0 (0)
>       active UMCs bitmask                   = 0x0
>
>
> Both writes are passed.
>
> vm# wrmsr 0xc0000302 0x1f
> vm# wrmsr 0xc0000302 0x040000000000001f
>
> Here is bcc output. Both writes are good.
>
> kvm# /usr/share/bcc/tools/trace -t -C 'kvm_pmu_set_msr "%x", retval'
> ... ...
> 4.748614 19 43545 43550 CPU 0/KVM kvm_pmu_set_msr 0
> 10.97396 19 43545 43550 CPU 0/KVM kvm_pmu_set_msr 0
>

Thanks for testing. I cannot replicate this either with an upstream kernel.

- Sandipan
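
For reference, a minimal sketch (Python, not part of the original exchange) that decodes which bits the rejected value sets compared with the accepted one. The helper name set_bits is hypothetical. The arithmetic shows that the only difference between the two writes is bit 58; reading that bit as the LBR freeze status bit is an assumption on my part, suggested by the write failing only on the guest where LBRv2 is not exposed, and should be checked against the APM.

```python
def set_bits(value):
    """Return the positions of all set bits in a 64-bit value, lowest first."""
    return [bit for bit in range(64) if (value >> bit) & 1]


# Values taken from the wrmsr commands quoted in the thread.
ACCEPTED = 0x1F                 # write that succeeds on the AWS guest
REJECTED = 0x040000000000001F   # write that triggers the unchecked MSR access error

# Bits present in the rejected value but not in the accepted one.
extra = REJECTED & ~ACCEPTED

print("accepted bits:", set_bits(ACCEPTED))
print("rejected bits:", set_bits(REJECTED))
print("extra bits:   ", set_bits(extra))
```

The low five bits line up with the five core counters reported by cpuid on the failing guest, so the hypervisor there appears to reject only the one additional high bit rather than ignoring it.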