On Thu, Feb 13, 2025 at 10:16 AM Leo Yan <leo.yan@xxxxxxx> wrote: > > On Sun, Feb 02, 2025 at 06:43:05PM -0600, Rob Herring (Arm) wrote: > > [...] > > > +void brbe_enable(const struct arm_pmu *arm_pmu) > > +{ > > + struct pmu_hw_events *cpuc = this_cpu_ptr(arm_pmu->hw_events); > > + u64 brbfcr = 0, brbcr = 0; > > + > > + /* > > + * Merge the permitted branch filters of all events. > > + */ > > + for (int i = 0; i < ARMPMU_MAX_HWEVENTS; i++) { > > + struct perf_event *event = cpuc->events[i]; > > + > > + if (event && has_branch_stack(event)) { > > + brbfcr |= event->hw.branch_reg.config; > > + brbcr |= event->hw.extra_reg.config; > > + } > > + } > > + > > + /* > > + * If the record buffer contains any branches, we've already read them > > + * out and don't want to read them again. > > + * No need to sync as we're already stopped. > > + */ > > + brbe_invalidate_nosync(); > > + isb(); // Make sure invalidate takes effect before enabling > > + > > + /* > > + * In VHE mode with MDCR_EL2.HPMN set to PMCR_EL0.N, the counters are > > + * controlled by BRBCR_EL1 rather than BRBCR_EL2 (which writes to > > + * BRBCR_EL1 are redirected to). Use the same value for both register > > + * except keep EL1 and EL0 recording disabled in guests. > > + */ > > + if (is_kernel_in_hyp_mode()) > > + write_sysreg_s(brbcr & ~(BRBCR_ELx_ExBRE | BRBCR_ELx_E0BRE), SYS_BRBCR_EL12); > > + write_sysreg_s(brbcr, SYS_BRBCR_EL1); > > + isb(); // Ensure BRBCR_ELx settings take effect before unpausing > > + > > + write_sysreg_s(brbfcr, SYS_BRBFCR_EL1); > > Seems to me, it is weird that first enable recording (BRBCR), then set > control register BRBFCR. And the writing SYS_BRBFCR_EL1 not guarded > by a barrier is also a bit concerned. We are always disabled (paused) when we enter brbe_enable(). So the last thing we do is unpause. The only ordering we care about after writing SYS_BRBFCR_EL1 is writing PMCR which has an isb before it is written. > > +} > > + > > +void brbe_disable(void) > > +{ > > + /* > > + * No need for synchronization here as synchronization in PMCR write > > + * ensures ordering and in the interrupt handler this is a NOP as > > + * we're already paused. > > + */ > > + write_sysreg_s(BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1); > > Maybe the Arm ARM causes the confusion for the description of the > PAUSED bit, I read it as this bit is a status bit to indicate > branch recording is paused. I agree, but I tested that writing it sets the bit (on FVP). Rule RSRJND says s/w clears the bit to unpause, so it is definitely writeable. While it doesn't say anything explicitly about s/w setting the bit, there is no definition in the Arm ARM of a 'write 0 to clear' only bit while there are W1C and W1S definitions. > > +} > > + > > +static const int brbe_type_to_perf_type_map[BRBINFx_EL1_TYPE_DEBUG_EXIT + 1][2] = { > > + [BRBINFx_EL1_TYPE_DIRECT_UNCOND] = { PERF_BR_UNCOND, 0 }, > > + [BRBINFx_EL1_TYPE_INDIRECT] = { PERF_BR_IND, 0 }, > > + [BRBINFx_EL1_TYPE_DIRECT_LINK] = { PERF_BR_CALL, 0 }, > > + [BRBINFx_EL1_TYPE_INDIRECT_LINK] = { PERF_BR_IND_CALL, 0 }, > > + [BRBINFx_EL1_TYPE_RET] = { PERF_BR_RET, 0 }, > > + [BRBINFx_EL1_TYPE_DIRECT_COND] = { PERF_BR_COND, 0 }, > > + [BRBINFx_EL1_TYPE_CALL] = { PERF_BR_CALL, 0 }, > > + [BRBINFx_EL1_TYPE_ERET] = { PERF_BR_ERET, 0 }, > > + [BRBINFx_EL1_TYPE_IRQ] = { PERF_BR_IRQ, 0 }, > > I saw this table cannot reflect the complete branch type. We might > need to consider to extend the perf branch flags later. > > If the 'new_type' is always zero, it is not necessary to maintain a > array with two items (the second one is always 0). I'm adding the new_type's back in the next version. > > > +}; > > + > > +static void brbe_set_perf_entry_type(struct perf_branch_entry *entry, u64 brbinf) > > +{ > > + int brbe_type = brbinf_get_type(brbinf); > > + > > + if (brbe_type <= BRBINFx_EL1_TYPE_DEBUG_EXIT) { > > + const int *br_type = brbe_type_to_perf_type_map[brbe_type]; > > + > > + entry->type = br_type[0]; > > + entry->new_type = br_type[1]; > > + } > > +} > > + > > +static int brbinf_get_perf_priv(u64 brbinf) > > +{ > > + int brbe_el = brbinf_get_el(brbinf); > > + > > + switch (brbe_el) { > > + case BRBINFx_EL1_EL_EL0: > > + return PERF_BR_PRIV_USER; > > + case BRBINFx_EL1_EL_EL1: > > + return PERF_BR_PRIV_KERNEL; > > + case BRBINFx_EL1_EL_EL2: > > + if (is_kernel_in_hyp_mode()) > > + return PERF_BR_PRIV_KERNEL; > > + return PERF_BR_PRIV_HV; > > + default: > > + pr_warn_once("%d - unknown branch privilege captured\n", brbe_el); > > + return PERF_BR_PRIV_UNKNOWN; > > + } > > +} > > + > > +static void capture_brbe_flags(struct perf_branch_entry *entry, > > + const struct perf_event *event, > > + u64 brbinf) > > +{ > > + brbe_set_perf_entry_type(entry, brbinf); > > + > > + if (!branch_sample_no_cycles(event)) > > + entry->cycles = brbinf_get_cycles(brbinf); > > + > > + if (!branch_sample_no_flags(event)) { > > + /* Mispredict info is available for source only and complete branch records. */ > > + if (!brbe_record_is_target_only(brbinf)) { > > + entry->mispred = brbinf_get_mispredict(brbinf); > > + entry->predicted = !entry->mispred; > > + } > > + > > + /* > > + * Currently TME feature is neither implemented in any hardware > > + * nor it is being supported in the kernel. Just warn here once > > + * if TME related information shows up rather unexpectedly. > > + */ > > + if (brbinf_get_lastfailed(brbinf) || brbinf_get_in_tx(brbinf)) > > + pr_warn_once("Unknown transaction states\n"); > > If the branch is in transaction, we can set: > > entry->in_tx = 1; We actively don't want to support the feature. The comment there is from Mark's feedback on a prior version. > > > + } > > + > > + /* > > + * Branch privilege level is available for target only and complete > > + * branch records. > > + */ > > + if (!brbe_record_is_source_only(brbinf)) > > + entry->priv = brbinf_get_perf_priv(brbinf); > > This logic is not quite right. In theory, if we check with above > condition (!brbe_record_is_source_only(brbinf)), it might be the > case both source and target are not valid. We never get here if the record is not valid. A valid record must have at least 1 address valid. I could merge capture_brbe_flags() into perf_entry_from_brbe_regset(). There's not much reason to have a separate function. Rob