On Wed, Feb 9, 2022 at 1:00 AM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
>
> On 5/2/2022 9:55 am, Jim Mattson wrote:
> >> +static unsigned int amd_pmc_perf_hw_id(struct kvm_pmc *pmc)
> >> {
> >> +	u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
> > On AMD, the event select is 12 bits.
>
> Out of your carefulness, we already know this fact.
>
> This function to get the perf_hw_id by the last 16 bits still works because we
> currently do not have a 12-bits-select event defined in the amd_event_mapping[].
> The 12-bits-select events (if any) will be programed in the type of PERF_TYPE_RAW.

I beg to differ. It doesn't matter whether there are 12-bit event
selects in amd_event_mapping[] or not. The fundamental problem is that
the equality operation on event selects is broken, because it ignores
the high 4 bits. Hence, we may actually find an entry in that table
that we *think* is for the requested event, but instead it's for some
other event with 0 in the high 4 bits. For example, if the guest
requests event 0x1d0 (retired fused instructions), they will get event
0xd0 instead. According to amd_event_mapping[], event 0xd0 is
"PERF_COUNT_HW_STALLED_CYCLES_FRONTEND." However, according to the
Milan PPR, event 0xd0 doesn't exist. So, I don't actually know what
we're counting.
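
To make the collision concrete, here is a minimal standalone sketch
(userspace C, not kernel code). The macro values and all helper names
are my own stand-ins for illustration; the only assumption taken from
the discussion is that the AMD event select lives in bits 7:0 plus
bits 35:32 of the eventsel value.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the kernel masks (assumed values, illustration only). */
#define EVENTSEL_EVENT_LOW	0x000000FFULL	/* event select bits 7:0 */
#define EVENTSEL_EVENT_HIGH	(0xFULL << 32)	/* event select bits 11:8, stored in bits 35:32 */

/* Hypothetical helper: build an eventsel value from a 12-bit event select. */
static uint64_t amd_encode_event(uint16_t event)
{
	return ((uint64_t)(event & 0xF00) << 24) | (event & 0xFF);
}

/* Hypothetical helper: recover the full 12-bit event select. */
static uint16_t amd_event_select12(uint64_t eventsel)
{
	return (uint16_t)(((eventsel & EVENTSEL_EVENT_HIGH) >> 24) |
			  (eventsel & EVENTSEL_EVENT_LOW));
}

/* Truncating extraction, as in the quoted hunk: keeps bits 7:0 only. */
static uint8_t event_select8(uint64_t eventsel)
{
	return (uint8_t)(eventsel & EVENTSEL_EVENT_LOW);
}

/* Event 0x1d0 is indistinguishable from 0xd0 after truncation. */
static void demo_collision(void)
{
	uint64_t eventsel = amd_encode_event(0x1d0);

	assert(event_select8(eventsel) == 0xd0);	/* high 4 bits lost */
	assert(amd_event_select12(eventsel) == 0x1d0);	/* full select kept */
}
```

Any table lookup keyed on event_select8() would therefore match the
0xd0 entry for a guest that asked for 0x1d0.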

At the very least, we need a patch like the following (which I fully
expect gmail to mangle):

--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -210,7 +210,8 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
 	if (!allow_event)
 		return;
 
-	if (!(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE |
+	if (!(eventsel & ((0xFULL << 32) |
+			  ARCH_PERFMON_EVENTSEL_EDGE |
 			  ARCH_PERFMON_EVENTSEL_INV |
 			  ARCH_PERFMON_EVENTSEL_CMASK |
 			  HSW_IN_TX |

By the way, the following events from amd_event_mapping[] are not
listed in the Milan PPR:
{ 0x7d, 0x07, PERF_COUNT_HW_CACHE_REFERENCES }
{ 0x7e, 0x07, PERF_COUNT_HW_CACHE_MISSES }
{ 0xd0, 0x00, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND }
{ 0xd1, 0x00, PERF_COUNT_HW_STALLED_CYCLES_BACKEND }

Perhaps we should build a table based on amd_f17h_perfmon_event_map[]
for newer AMD processors?