On Tue, Feb 1, 2022 at 10:03 PM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>
> Hi Stephane,
>
> On 02-Feb-22 10:57 AM, Stephane Eranian wrote:
> > On Tue, Feb 1, 2022 at 8:29 PM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
> >>
> >> Perf counter may overcount for a list of Retire Based Events. Implement
> >> workaround for Zen3 Family 19 Model 00-0F processors as suggested in
> >> Revision Guide[1]:
> >>
> >> To count the non-FP affected PMC events correctly:
> >>   o Use Core::X86::Msr::PERF_CTL2 to count the events, and
> >>   o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
> >>   o Program Core::X86::Msr::PERF_CTL2[20] to 0b.
> >>
> >> Above workaround suggests to clear PERF_CTL2[20], but that will disable
> >> sampling mode. Given the fact that, there is already a skew between
> >> actual counter overflow vs PMI hit, we are anyway not getting accurate
> >> count for sampling events. Also, using PMC2 with both bit43 and bit20
> >> set can result in additional issues. Hence Linux implementation of
> >> workaround uses non-PMC2 counter for sampling events.
> >>
> > Something is missing from your description here. If you are not
> > clearing bit[20] and not setting bit[43], then how does running on
> > CTL2 by itself improve the count. Is that enough to make the counter
> > count correctly?
>
> Yes. For counting retire based events, we need PMC2[43] set and
> PMC2[20] clear so that it will not overcount.
>
Ok, I get that part now. You are forcing the bits in the
get_constraint() function.

> >
> > For sampling events, your patch makes CTL2 not available. That seems
> > to contradict the workaround. Are you doing this to free CTL2 for
> > counting mode events instead? If you are not using CTL2, then you are
> > not correcting the count. Are you saying this is okay in sampling mode
> > because of the skid, anyway?
>
> Correct. The constraint I am placing is to count retire events on
> PMC2 and sample retire events on other counters.
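
To make the bit forcing concrete, here is a minimal sketch of what
"PMC2[43] set and PMC2[20] clear" amounts to for a counting event's
config. The bit positions come from the workaround text quoted above;
the macro and function names are hypothetical and are not taken from
the patch.

```c
#include <stdint.h>

/* Bit positions as given in the quoted workaround text. */
#define SKETCH_CTL_BIT20 (1ULL << 20)
#define SKETCH_CTL_BIT43 (1ULL << 43)

/*
 * Hypothetical helper: return the config to program into PERF_CTL2 for a
 * counting (non-sampling) retire event. Bit 43 is forced to 1 and bit 20
 * is forced to 0; every other bit is passed through unchanged.
 */
static uint64_t sketch_retire_counting_config(uint64_t config)
{
	return (config | SKETCH_CTL_BIT43) & ~SKETCH_CTL_BIT20;
}
```

A sampling event would skip this helper and keep bit 20 as it is, which
is why it cannot benefit from the workaround.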
>
Why do you need to permanently exclude CTL2 for retired events, given
that you are forcing the bits in get_constraints() only in the config
of counting events, as opposed to in CTL2 itself? If the sampling
retired events are unconstrained, they can use any counter. If a
counting retired event is added, it has a "stronger" constraint and
will be scheduled before the unconstrained events, yielding the same
behavior you wanted, except on demand, which is preferable.

> Thanks,
> Ravi
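
To illustrate the on-demand behavior, here is a simplified sketch of
scheduling the most constrained events first. It is a greedy toy model,
not the kernel's scheduler. The struct, the function names, and the
counter count of 6 are assumptions made for the example.

```c
#include <stdint.h>

#define SKETCH_NUM_COUNTERS 6	/* assumed number of core PMCs */

struct sketch_event {
	uint8_t allowed_mask;	/* bit i set => event may use counter i */
	int counter;		/* assigned counter, or -1 if none */
};

/* Number of counters an event may use; a lower weight is a stronger constraint. */
static int sketch_weight(uint8_t mask)
{
	int w = 0;

	for (; mask; mask >>= 1)
		w += mask & 1;
	return w;
}

/*
 * Assign counters in order of increasing weight, so an event restricted to
 * one counter is placed before events that can run anywhere. Masks are
 * assumed to use only the low SKETCH_NUM_COUNTERS bits.
 * Returns the number of events that received a counter.
 */
static int sketch_schedule(struct sketch_event *ev, int n)
{
	uint8_t used = 0;
	int scheduled = 0;

	for (int i = 0; i < n; i++)
		ev[i].counter = -1;

	for (int w = 1; w <= SKETCH_NUM_COUNTERS; w++) {
		for (int i = 0; i < n; i++) {
			if (sketch_weight(ev[i].allowed_mask) != w)
				continue;
			for (int c = 0; c < SKETCH_NUM_COUNTERS; c++) {
				uint8_t bit = (uint8_t)(1u << c);

				if ((ev[i].allowed_mask & bit) && !(used & bit)) {
					used |= bit;
					ev[i].counter = c;
					scheduled++;
					break;
				}
			}
		}
	}
	return scheduled;
}
```

With three unconstrained sampling events (mask 0x3f) and one counting
retired event restricted to counter 2 (mask 0x04), the counting event
takes counter 2 and the sampling events fall on the remaining counters.
With no counting retired event present, counter 2 stays available to
the sampling events.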