Hi Alexandru, On 6/16/23 12:50, Alexandru Elisei wrote: > Hi, > > On Wed, May 31, 2023 at 10:14:36PM +0200, Eric Auger wrote: >> In some ARM ARM ddi0487 revisions it is said that >> disabling/enabling a pair of counters that are paired >> by a CHAIN event should follow a given sequence: >> >> Enable the high counter first, isb, enable low counter >> Disable the low counter first, isb, disable the high counter >> >> This was the case in Fc. However this is not written anymore >> in subsequent revions such as Ia. >> >> Anyway, just in case, and because it also makes the code a >> little bit simpler, introduce 2 helpers to enable/disable chain >> counters that execute those sequences and replace the existing >> PMCNTENCLR/ENSET calls (at least this cannot do any harm). >> >> Also fix 2 write_sysreg_s(0x0, PMCNTENSET_EL0) in subtest 5 & 6 >> and replace them by PMCNTENCLR writes since writing 0 in >> PMCNTENSET_EL0 has no effect. >> >> Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx> >> >> --- >> >> v1 -> v2: >> - fix the enable_chain_counter()/disable_chain_counter() >> sequence, ie. swap n + 1 / n as reported by Alexandru. >> - fix an other comment using the 'low' terminology >> --- >> arm/pmu.c | 37 ++++++++++++++++++++++++++++--------- >> 1 file changed, 28 insertions(+), 9 deletions(-) >> >> diff --git a/arm/pmu.c b/arm/pmu.c >> index 74dd4c10..74c9f6f9 100644 >> --- a/arm/pmu.c >> +++ b/arm/pmu.c >> @@ -731,6 +731,22 @@ static void test_chained_sw_incr(bool unused) >> read_regn_el0(pmevcntr, 0), \ >> read_sysreg(pmovsclr_el0)) >> >> +static void enable_chain_counter(int even) >> +{ >> + write_sysreg_s(BIT(even + 1), PMCNTENSET_EL0); /* Enable the high counter first */ >> + isb(); >> + write_sysreg_s(BIT(even), PMCNTENSET_EL0); /* Enable the low counter */ >> + isb(); >> +} >> + >> +static void disable_chain_counter(int even) >> +{ >> + write_sysreg_s(BIT(even), PMCNTENCLR_EL0); /* Disable the low counter first*/ >> + isb(); >> + write_sysreg_s(BIT(even + 1), PMCNTENCLR_EL0); /* Disable the high counter */ >> + isb(); >> +} >> + >> static void test_chain_promotion(bool unused) > Here's what test_chain_promotion() does for the first subtest: > > static void test_chain_promotion(bool unused) > { > uint32_t events[] = {MEM_ACCESS, CHAIN}; > void *addr = malloc(PAGE_SIZE); > > if (!satisfy_prerequisites(events, ARRAY_SIZE(events))) > return; > > /* Only enable CHAIN counter */ > report_prefix_push("subtest1"); > pmu_reset(); > write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0); > write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); > write_sysreg_s(0x2, PMCNTENSET_EL0); > isb(); > > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > > And here's what test_chained_counters() does: > > static void test_chained_counters(bool unused) > { > uint32_t events[] = {CPU_CYCLES, CHAIN}; > uint64_t all_set = pmevcntr_mask(); > > if (!satisfy_prerequisites(events, ARRAY_SIZE(events))) > return; > > pmu_reset(); > > write_regn_el0(pmevtyper, 0, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0); > write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); > /* enable counters #0 and #1 */ > write_sysreg_s(0x3, PMCNTENSET_EL0); > write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32); > > precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E); > > Why the extra ISB in test_chain_promotion()? Or, if you want to look at it > the other way around, is the ISB missing from test_chained_counters()? agreed. as mentionned below, enable_chain_counter() can also be used here and this issues an isb(). > >> { >> uint32_t events[] = {MEM_ACCESS, CHAIN}; >> @@ -769,16 +785,17 @@ static void test_chain_promotion(bool unused) >> /* 1st COUNT with CHAIN enabled, next COUNT with CHAIN disabled */ >> report_prefix_push("subtest3"); >> pmu_reset(); >> - write_sysreg_s(0x3, PMCNTENSET_EL0); >> write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); >> - isb(); >> + enable_chain_counter(0); >> PRINT_REGS("init"); > Here's how subtest3 ends up looking: > > report_prefix_push("subtest3"); > pmu_reset(); > write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); > enable_chain_counter(0); > PRINT_REGS("init"); > > mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); > > And here's something similar from test_chained_counters(): > > pmu_reset(); > write_sysreg_s(0x3, PMCNTENSET_EL0); > > write_regn_el0(pmevcntr, 0, PRE_OVERFLOW_32); > write_regn_el0(pmevcntr, 1, 0x1); > precise_instrs_loop(22, pmu.pmcr_ro | PMU_PMCR_E); > > > Why does test_chain_promotion() use enable_chain_counter() and > test_chained_counters() doesn't? > > Could probably find more examples of this in test_chain_promotion(). agreed. I used enable_chain_counter() in test_chain_promotion() and test_chained_sw_incr() whenever possible. In other case, both counters are not set in 'chained' mode. > > As an aside, it's extremely difficult to figure out how the counters are > programmed for a subtest. In the example above, you need to go back 2 > subtests, to the start of test_chain_promotion(), to figure that out. And > that only gets worse the subtest number increases. test_chain_promotion() > would really benefit from being split into separate functions, each with > each own clear initial state. But that's for another patch, not for this > series. Yes I do agree with you. This subset splitting was not the best idea ever. That can be reworked separately indeed. Thank you for your time! Eric > > Thanks, > Alex > >> >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> PRINT_REGS("After 1st loop"); >> >> /* disable the CHAIN event */ >> - write_sysreg_s(0x2, PMCNTENCLR_EL0); >> + disable_chain_counter(0); >> + write_sysreg_s(0x1, PMCNTENSET_EL0); /* Enable the low counter */ >> + isb(); >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> PRINT_REGS("After 2nd loop"); >> report(read_sysreg(pmovsclr_el0) == 0x1, >> @@ -799,9 +816,11 @@ static void test_chain_promotion(bool unused) >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> PRINT_REGS("After 1st loop"); >> >> - /* enable the CHAIN event */ >> - write_sysreg_s(0x3, PMCNTENSET_EL0); >> + /* Disable the low counter first and enable the chain counter */ >> + write_sysreg_s(0x1, PMCNTENCLR_EL0); >> isb(); >> + enable_chain_counter(0); >> + >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> >> PRINT_REGS("After 2nd loop"); >> @@ -825,10 +844,10 @@ static void test_chain_promotion(bool unused) >> PRINT_REGS("After 1st loop"); >> >> /* 0 becomes CHAINED */ >> - write_sysreg_s(0x0, PMCNTENSET_EL0); >> + write_sysreg_s(0x3, PMCNTENCLR_EL0); >> write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); >> - write_sysreg_s(0x3, PMCNTENSET_EL0); >> write_regn_el0(pmevcntr, 1, 0x0); >> + enable_chain_counter(0); >> >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> PRINT_REGS("After 2nd loop"); >> @@ -844,13 +863,13 @@ static void test_chain_promotion(bool unused) >> write_regn_el0(pmevtyper, 0, MEM_ACCESS | PMEVTYPER_EXCLUDE_EL0); >> write_regn_el0(pmevtyper, 1, CHAIN | PMEVTYPER_EXCLUDE_EL0); >> write_regn_el0(pmevcntr, 0, PRE_OVERFLOW2_32); >> - write_sysreg_s(0x3, PMCNTENSET_EL0); >> + enable_chain_counter(0); >> PRINT_REGS("init"); >> >> mem_access_loop(addr, COUNT, pmu.pmcr_ro | PMU_PMCR_E); >> PRINT_REGS("After 1st loop"); >> >> - write_sysreg_s(0x0, PMCNTENSET_EL0); >> + disable_chain_counter(0); >> write_regn_el0(pmevtyper, 1, CPU_CYCLES | PMEVTYPER_EXCLUDE_EL0); >> write_sysreg_s(0x3, PMCNTENSET_EL0); >> >> -- >> 2.38.1 >>