Hi Jim,

On 03-Feb-22 9:39 AM, Jim Mattson wrote:
> On Wed, Feb 2, 2022 at 2:52 AM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>>
>> Perf counter may overcount for a list of Retire Based Events. Implement
>> workaround for Zen3 Family 19 Model 00-0F processors as suggested in
>> Revision Guide[1]:
>>
>>   To count the non-FP affected PMC events correctly:
>>     o Use Core::X86::Msr::PERF_CTL2 to count the events, and
>>     o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
>>     o Program Core::X86::Msr::PERF_CTL2[20] to 0b.
>>
>> Note that the specified workaround applies only to counting events and
>> not to sampling events. Thus sampling event will continue functioning
>> as is.
>>
>> Although the issue exists on all previous Zen revisions, the workaround
>> is different and thus not included in this patch.
>>
>> This patch needs Like's patch[2] to make it work on kvm guest.
>
> IIUC, this patch along with Like's patch actually breaks PMU
> virtualization for a kvm guest.
>
> Suppose I have some code which counts event 0xC2 [Retired Branch
> Instructions] on PMC0 and event 0xC4 [Retired Taken Branch
> Instructions] on PMC1. I then divide PMC1 by PMC0 to see what
> percentage of my branch instructions are taken. On hardware that
> suffers from erratum 1292, both counters may overcount, but if the
> inaccuracy is small, then my final result may still be fairly close to
> reality.
>
> With these patches, if I run that same code in a kvm guest, it looks
> like one of those events will be counted on PMC2 and the other won't
> be counted at all. So, when I calculate the percentage of branch
> instructions taken, I either get 0 or infinity.

Events get multiplexed internally. See the quick test below, which I ran
inside the guest. My host is running with my patch plus Like's patch, and
the guest is running with only my patch.

  $ ./perf stat -e branch-instructions,branch-misses -- ./branch-misses

   Performance counter stats for './branch-misses':

      19,847,153,209      branch-instructions:u                              (50.03%)
         950,410,251      branch-misses:u      #    4.79% of all branches    (49.97%)

  $ cat branch-misses.c
  #include <stdlib.h>

  int main()
  {
          long i = 1000000000;
          long j = 0;

          while(i--) {
                  switch(rand() % 20) {
                  case 0: j += 0; break;
                  case 1: j += 1; break;
                  case 2: j += 2; break;
                  case 3: j += 3; break;
                  case 4: j += 4; break;
                  case 5: j += 5; break;
                  case 6: j += 6; break;
                  case 7: j += 7; break;
                  case 8: j += 8; break;
                  case 9: j += 9; break;
                  case 10: j += 10; break;
                  case 11: j += 11; break;
                  case 12: j += 12; break;
                  case 13: j += 13; break;
                  case 14: j += 14; break;
                  case 15: j += 15; break;
                  case 16: j += 16; break;
                  case 17: j += 17; break;
                  case 18: j += 18; break;
                  case 19: j += 19; break;
                  default: j += 20; break;
                  }
          }
          return 0;
  }

Thanks,
Ravi
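
For reference, the percentages in parentheses in the perf stat output above (50.03% and 49.97%) are the share of the run during which each event was actually scheduled on a counter. When events are multiplexed, perf extrapolates each raw count by enabled time over running time before computing derived ratios such as "4.79% of all branches". Below is a minimal sketch of that arithmetic. The struct and function names are hypothetical and are not perf's internal API, and the sketch assumes the event rate is roughly uniform over the run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical readout of one multiplexed event (names are illustrative). */
struct counter_reading {
	uint64_t raw;     /* events counted while scheduled on a PMC */
	uint64_t enabled; /* time the event was enabled, in ns */
	uint64_t running; /* time the event was actually on a PMC, in ns */
};

/*
 * Extrapolate the raw count to the whole enabled period.
 * Returns 0.0 if the event was never scheduled, since no estimate
 * is possible in that case.
 */
static double scaled_count(struct counter_reading r)
{
	/* An event cannot run for longer than it was enabled. */
	assert(r.running <= r.enabled);

	if (r.running == 0)
		return 0.0;

	return (double)r.raw * (double)r.enabled / (double)r.running;
}

/*
 * Percentage of 'num' events relative to 'den' events, using scaled
 * counts. Returns -1.0 when the denominator estimate is zero, so the
 * caller can tell "no data" apart from a real 0%.
 */
static double percent_of(struct counter_reading num,
			 struct counter_reading den)
{
	double d = scaled_count(den);

	if (d == 0.0)
		return -1.0;

	return 100.0 * scaled_count(num) / d;
}
```

With two events that each ran for about half of the enabled time, both raw counts are roughly doubled, so their ratio stays meaningful even though neither event was counted for the whole run.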