Hi, On Wed, May 31, 2023 at 10:14:36PM +0200, Eric Auger wrote: > In some ARM ARM ddi0487 revisions it is said that > disabling/enabling a pair of counters that are paired > by a CHAIN event should follow a given sequence: > > Enable the high counter first, isb, enable low counter > Disable the low counter first, isb, disable the high counter > > This was the case in Fc. However this is not written anymore > in subsequent revions such as Ia. > > Anyway, just in case, and because it also makes the code a > little bit simpler, introduce 2 helpers to enable/disable chain > counters that execute those sequences and replace the existing > PMCNTENCLR/ENSET calls (at least this cannot do any harm). > > Also fix 2 write_sysreg_s(0x0, PMCNTENSET_EL0) in subtest 5 & 6 > and replace them by PMCNTENCLR writes since writing 0 in > PMCNTENSET_EL0 has no effect. > > Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx> > > --- > > v1 -> v2: > - fix the enable_chain_counter()/disable_chain_counter() > sequence, ie. swap n + 1 / n as reported by Alexandru. > - fix an other comment using the 'low' terminology > --- > arm/pmu.c | 37 ++++++++++++++++++++++++++++--------- > 1 file changed, 28 insertions(+), 9 deletions(-) > > diff --git a/arm/pmu.c b/arm/pmu.c > index 74dd4c10..74c9f6f9 100644 > --- a/arm/pmu.c > +++ b/arm/pmu.c > @@ -731,6 +731,22 @@ static void test_chained_sw_incr(bool unused) > read_regn_el0(pmevcntr, 0), \ > read_sysreg(pmovsclr_el0)) > > +static void enable_chain_counter(int even) > +{ > + write_sysreg_s(BIT(even + 1), PMCNTENSET_EL0); /* Enable the high counter first */ > + isb(); > + write_sysreg_s(BIT(even), PMCNTENSET_EL0); /* Enable the low counter */ > + isb(); > +} > + > +static void disable_chain_counter(int even) > +{ > + write_sysreg_s(BIT(even), PMCNTENCLR_EL0); /* Disable the low counter first*/ > + isb(); > + write_sysreg_s(BIT(even + 1), PMCNTENCLR_EL0); /* Disable the high counter */ > + isb(); > +} > + > static void test_chain_promotion(bool unused) Here's what test_chain_promotion() does for the first subtest: static void test_chain_promotion(bool unused) { uint32_t events[] = {MEM_ACCESS, CHAIN}; void *addr = malloc(PAGE_SIZE); if (!satisfy_prerequisites(events, ARRAY_SIZE(events))) return; /* Only enable CHAIN counter */ report_prefix_push("subtest1"); pmu_reset(); write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0); write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); write_sysreg_s(0x2, PMCNTENSET_EL0); isb(); mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); And here's what test_chained_counters() does: static void test_chained_counters(bool unused) { uint32_t events[] = {CPU_CYCLES, CHAIN}; uint64_t all_set = pmevcntr_mask(); if (!satisfy_prerequisites(events, ARRAY_SIZE(events))) return; pmu_reset(); write_regn_el0(pmevtyper, 0, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0); write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); /* enable counters #0 and #1 */ write_sysreg_s(0x3, PMCNTENSET_EL0); write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32); precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E); Why the extra ISB in test_chain_promotion()? Or, if you want to look at it the other way around, is the ISB missing from test_chained_counters()? > { > uint32_t events[] = {MEM_ACCESS, CHAIN}; > @@ -769,16 +785,17 @@ static void test_chain_promotion(bool unused) > /* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */ > report_prefix_push("subtest3"); > pmu_reset(); > - write_sysreg_s(0x3, PMCNTENSET_EL0); > write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); > - isb(); > + enable_chain_counter(0); > PRINT_REGS("init"); Here's how subtest3 ends up looking: report_prefix_push("subtest3"); pmu_reset(); write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); enable_chain_counter(0); PRINT_REGS("init"); mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); And here's something similar from test_chained_counters(): pmu_reset(); write_sysreg_s(0x3, PMCNTENSET_EL0); write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32); write_regn_el0(pmevcntr, 1, 0x1); precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E); Why does test_chain_promotion() use enable_chain_counter() and test_chained_counters() doesn't? Could probably find more examples of this in test_chain_promotion(). As an aside, it's extremely difficult to figure out how the counters are programmed for a subtest. In the example above, you need to go back 2 subtests, to the start of test_chain_promotion(), to figure that out. And that only gets worse the subtest number increases. test_chain_promotion() would really benefit from being split into separate functions, each with each own clear initial state. But that's for another patch, not for this series. Thanks, Alex > > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > PRINT_REGS("After 1st loop"); > > /* disable the CHAIN event */ > - write_sysreg_s(0x2, PMCNTENCLR_EL0); > + disable_chain_counter(0); > + write_sysreg_s(0x1, PMCNTENSET_EL0); /* Enable the low counter */ > + isb(); > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > PRINT_REGS("After 2nd loop"); > report(read_sysreg(pmovsclr_el0) == 0x1, > @@ -799,9 +816,11 @@ static void test_chain_promotion(bool unused) > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > PRINT_REGS("After 1st loop"); > > - /* enable the CHAIN event */ > - write_sysreg_s(0x3, PMCNTENSET_EL0); > + /* Disable the low counter first and enable the chain counter */ > + write_sysreg_s(0x1, PMCNTENCLR_EL0); > isb(); > + enable_chain_counter(0); > + > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > > PRINT_REGS("After 2nd loop"); > @@ -825,10 +844,10 @@ static void test_chain_promotion(bool unused) > PRINT_REGS("After 1st loop"); > > /* 0 becomes CHAINED */ > - write_sysreg_s(0x0, PMCNTENSET_EL0); > + write_sysreg_s(0x3, PMCNTENCLR_EL0); > write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); > - write_sysreg_s(0x3, PMCNTENSET_EL0); > write_regn_el0(pmevcntr, 1, 0x0); > + enable_chain_counter(0); > > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > PRINT_REGS("After 2nd loop"); > @@ -844,13 +863,13 @@ static void test_chain_promotion(bool unused) > write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0); > write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); > write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); > - write_sysreg_s(0x3, PMCNTENSET_EL0); > + enable_chain_counter(0); > PRINT_REGS("init"); > > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > PRINT_REGS("After 1st loop"); > > - write_sysreg_s(0x0, PMCNTENSET_EL0); > + disable_chain_counter(0); > write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0); > write_sysreg_s(0x3, PMCNTENSET_EL0); > > -- > 2.38.1 >