Hi Stephane,

On 02-Feb-22 11:46 AM, Stephane Eranian wrote:
> On Tue, Feb 1, 2022 at 10:03 PM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>>
>> Hi Stephane,
>>
>> On 02-Feb-22 10:57 AM, Stephane Eranian wrote:
>>> On Tue, Feb 1, 2022 at 8:29 PM Ravi Bangoria <ravi.bangoria@xxxxxxx> wrote:
>>>>
>>>> Perf counter may overcount for a list of Retire Based Events. Implement
>>>> workaround for Zen3 Family 19 Model 00-0F processors as suggested in
>>>> Revision Guide[1]:
>>>>
>>>>   To count the non-FP affected PMC events correctly:
>>>>     o Use Core::X86::Msr::PERF_CTL2 to count the events, and
>>>>     o Program Core::X86::Msr::PERF_CTL2[43] to 1b, and
>>>>     o Program Core::X86::Msr::PERF_CTL2[20] to 0b.
>>>>
>>>> Above workaround suggests to clear PERF_CTL2[20], but that will disable
>>>> sampling mode. Given the fact that, there is already a skew between
>>>> actual counter overflow vs PMI hit, we are anyway not getting accurate
>>>> count for sampling events. Also, using PMC2 with both bit43 and bit20
>>>> set can result in additional issues. Hence Linux implementation of
>>>> workaround uses non-PMC2 counter for sampling events.
>>>>
>>> Something is missing from your description here. If you are not
>>> clearing bit[20] and not setting bit[43], then how does running on
>>> CTL2 by itself improve the count. Is that enough to make the counter
>>> count correctly?
>>
>> Yes. For counting retire based events, we need PMC2[43] set and
>> PMC2[20] clear so that it will not overcount.
>>
> Ok, I get that part now. You are forcing the bits in the
> get_constraint() function.
>
>>>
>>> For sampling events, your patch makes CTL2 not available. That seems
>>> to contradict the workaround. Are you doing this to free CTL2 for
>>> counting mode events instead? If you are not using CTL2, then you are
>>> not correcting the count. Are you saying this is okay in sampling mode
>>> because of the skid, anyway?
>>
>> Correct. The constraint I am placing is to count retire events on
>> PMC2 and sample retire events on other counters.
>>
> Why do you need to permanently exclude CTL2 for retired events given
> you are forcing the bits in the get_constraints() for counting events
> config only, i.e., as opposed to in CTL2 itself.
> If the sampling retired events are unconstrained, they can use any
> counters. If a counting retired event is added, it has a "stronger"
> constraints and will be scheduled before the unconstrained events,
> yield the same behavior you wanted, except on demand which is preferable.

Got it. Let me respin.

Thanks,
Ravi
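
For reference, a rough sketch of the on-demand behavior discussed above: counting retire events get bit 43 set, bit 20 cleared, and are pinned to PMC2, while sampling retire events stay unconstrained. This is standalone illustration code, not the kernel patch. The macro names, struct sched_choice, constrain_retire_event() and the six-counter mask are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions come from the thread; the macro names are hypothetical. */
#define PERF_CTL_BIT20  (1ULL << 20)  /* must be 0b for the workaround */
#define PERF_CTL_BIT43  (1ULL << 43)  /* must be 1b for the workaround */
#define WORKAROUND_PMC  2             /* PERF_CTL2 / PMC2 */

/* Counter mask: bit i set means counter i may host the event.
 * Assumes six core counters; adjust for the actual hardware. */
#define ALL_PMCS 0x3fu

/* Hypothetical result type: adjusted config plus allowed counters. */
struct sched_choice {
	uint64_t config;
	uint32_t cntr_mask;
};

/* Decide config bits and allowed counters for one retire-based event. */
static struct sched_choice constrain_retire_event(uint64_t config, bool sampling)
{
	struct sched_choice c = { config, ALL_PMCS };

	if (!sampling) {
		/* Counting mode: apply the workaround and pin to PMC2. */
		c.config |= PERF_CTL_BIT43;
		c.config &= ~PERF_CTL_BIT20;
		c.cntr_mask = 1u << WORKAROUND_PMC;
	}
	/* Sampling mode: config untouched, any counter is allowed. */
	return c;
}
```

A scheduler that places the event with the fewest allowed counters first would then give PMC2 to a counting retire event whenever one is present, and leave PMC2 usable by sampling events otherwise.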