Hi Joey, On Thu, Feb 20, 2025 at 02:13:53PM +0000, Joey Gouly wrote: > Count EL2 cycles if that's the EL kvm-unit-tests is running at! > > Signed-off-by: Joey Gouly <joey.gouly@xxxxxxx> > --- > arm/pmu.c | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > > diff --git a/arm/pmu.c b/arm/pmu.c > index 2dc0822b..238e4628 100644 > --- a/arm/pmu.c > +++ b/arm/pmu.c > @@ -206,6 +206,8 @@ static void test_overflow_interrupt(bool overflow_at_64bits) {} > #define ID_DFR0_PMU_V3_8_5 0b0110 > #define ID_DFR0_PMU_IMPDEF 0b1111 > > +#define PMCCFILTR_EL0_NSH BIT(27) > + > static inline uint32_t get_id_aa64dfr0(void) { return read_sysreg(id_aa64dfr0_el1); } > static inline uint32_t get_pmcr(void) { return read_sysreg(pmcr_el0); } > static inline void set_pmcr(uint32_t v) { write_sysreg(v, pmcr_el0); } > @@ -247,7 +249,7 @@ static inline void precise_instrs_loop(int loop, uint32_t pmcr) > #define PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2) > > #define PMEVTYPER_EXCLUDE_EL1 BIT(31) I think you can drop the macro. I was expecting to see an exclude EL2 macro used in its place when the test is running at EL2, but it seems PMEVTYPER_EXCLUDE_EL1 is not used anywhere. Unless I'm missing something here. > -#define PMEVTYPER_EXCLUDE_EL0 BIT(30) > +#define PMEVTYPER_EXCLUDE_EL0 BIT(30) | BIT(27) I'm not really sure what that's supposed to achieve - if the test is running at EL2, and events from both EL0 and EL2 are excluded, what's left to count? I also don't understand what PMEVTYPER_EXCLUDE_EL0 does in the non-nested virt case (when kvm-unit-tests boots at El1). The tests run at EL1, so not counting events at EL0 shouldn't affect anything. Am I missing something obvious here? > > static bool is_event_supported(uint32_t n, bool warn) > { > @@ -1059,11 +1061,18 @@ static void test_overflow_interrupt(bool overflow_at_64bits) > static bool check_cycles_increase(void) > { > bool success = true; > + u64 pmccfiltr = 0; > > /* init before event access, this test only cares about cycle count */ > pmu_reset(); > set_pmcntenset(1 << PMU_CYCLE_IDX); > - set_pmccfiltr(0); /* count cycles in EL0, EL1, but not EL2 */ > + > +#if defined(__aarch64__) > + if (current_level() == CurrentEL_EL2) > + // include EL2 cycle counts > + pmccfiltr |= PMCCFILTR_EL0_NSH; > +#endif > + set_pmccfiltr(pmccfiltr); > > set_pmcr(get_pmcr() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E); > isb(); > @@ -1114,11 +1123,17 @@ static void measure_instrs(int num, uint32_t pmcr) > static bool check_cpi(int cpi) > { > uint32_t pmcr = get_pmcr() | PMU_PMCR_LC | PMU_PMCR_C | PMU_PMCR_E; > + u64 pmccfiltr = 0; > > /* init before event access, this test only cares about cycle count */ > pmu_reset(); > set_pmcntenset(1 << PMU_CYCLE_IDX); > - set_pmccfiltr(0); /* count cycles in EL0, EL1, but not EL2 */ > +#if defined(__aarch64__) > + if (current_level() == CurrentEL_EL2) > + // include EL2 cycle counts > + pmccfiltr |= PMCCFILTR_EL0_NSH; > +#endif It's called twice, so it could be abstracted in a function. Also, I find it interesting that for PMCCFILTR_EL0 you set the NSH bit based on current exception level, but for PMEVTYPER you set it unconditionally. Why the different approaches? For convenience of is there something more? Thanks, Alex