Hi, On Wed, Mar 15, 2023 at 12:07:22PM +0100, Eric Auger wrote: > The mem access loop currently features ISB barriers only. However > the mem_access loop counts the number of accesses to memory. ISB > do not garantee the PE cannot reorder memory access. Let's > add a DSB ISH before the write to PMCR_EL0 that enables the PMU > and after the last iteration, before disabling the PMU. > > Signed-off-by: Eric Auger <eric.auger@xxxxxxxxxx> > Suggested-by: Alexandru Elisei <alexandru.elisei@xxxxxxx> > > --- > > This was discussed in https://lore.kernel.org/all/YzxmHpV2rpfaUdWi@monolith.localdoman/ > --- > arm/pmu.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arm/pmu.c b/arm/pmu.c > index b88366a8..dde399e2 100644 > --- a/arm/pmu.c > +++ b/arm/pmu.c > @@ -301,6 +301,7 @@ static void mem_access_loop(void *addr, long loop, uint32_t pmcr) > { > uint64_t pmcr64 = pmcr; > asm volatile( > + " dsb ish\n" I think it might still be possible to reorder memory accesses which are part of the loop after the DSB above and before the PMU is enabled below. But the DSB above is needed to make sure previous memory accesses, which shouldn't be counted as part of the loop, are completed. I would put another DSB after the ISB which enables the PMU, that way all memory accesses are neatly sandwitches between two DSBs. Having 3 DSBs might look like overdoing it, but I reason it to be correct. What do you think? Thanks, Alex > " msr pmcr_el0, %[pmcr]\n" > " isb\n" > " mov x10, %[loop]\n" > @@ -308,6 +309,7 @@ asm volatile( > " ldr x9, [%[addr]]\n" > " cmp x10, #0x0\n" > " b.gt 1b\n" > + " dsb ish\n" > " msr pmcr_el0, xzr\n" > " isb\n" > : > -- > 2.38.1 > >