On Mon, Feb 24, 2025 at 02:03:17PM +0000, Leo Yan wrote: > On Mon, Feb 24, 2025 at 06:46:35AM -0600, Rob Herring wrote: > > On Mon, Feb 24, 2025 at 6:25 AM Leo Yan <leo.yan@xxxxxxx> wrote: > > > On Tue, Feb 18, 2025 at 02:40:06PM -0600, Rob Herring (Arm) wrote: > > > > > > > > From: Anshuman Khandual <anshuman.khandual@xxxxxxx> > > > > > > [...] > > > > > > > BRBE records are invalidated whenever events are reconfigured, a new > > > > task is scheduled in, or after recording is paused (and the records > > > > have been recorded for the event). The architecture allows branch > > > > records to be invalidated by the PE under implementation defined > > > > conditions. It is expected that these conditions are rare. > > > > > > [...] > > > > > > > +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in) > > > > +{ > > > > + struct arm_pmu *armpmu = *this_cpu_ptr(&cpu_armpmu); > > > > + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); > > > > + > > > > + if (!hw_events->branch_users) > > > > + return; > > > > + > > > > + if (sched_in) > > > > + brbe_invalidate(); > > > > +} > > > > > > Just a minor concern. I don't see any handling for task migration. > > > E.g., for a task is migrated from one CPU to another CPU, I expect we > > > need to save and restore branch records based on BRBE injection. So > > > far, the driver simply invalidates all records. > > > > > > I think this topic is very likely discussed before. If this is the > > > case, please ignore my comment. Except this, the code looks good > > > to me. > > > > Not really discussed on the list, but that was present in v18 (though > > not functional because .sched_task() hook wasn't actually enabled) and > > Mark removed it. His work is here[1].The only comment was: > > > > Note: saving/restoring at context-switch doesn't interact well with > > event rotation (e.g. if filters change) > > In the brbe_enable() function, it "Merge the permitted branch filters > of all events". Based on current implementation, all events share the > same branch filter. Critically, the brbe_enable() function merges the filters of all *active* events which have been installed into hardware. It does not track all events which can be rotated, and the resulting filter is not the same -- it can change as a result of rotation. > When event rotation happens, if without context switch, in theory we > should can directly use the branch record (no invalidation, no injection) > for all events. No; that only works in *some* cases, and will produce incorrect results in others. For example, consider filtering. Imagine a PMU with a single counter, and two events, where event-A filters for calls-and-returns and event-B filters for calls-only. When switching from event-A to event-B, it's theoretically possible to keep the existing records around, knowing that the returns can be filtered out later. When switching from event-B to event-A we cannot keep the existing records, since there are gaps whenever a return should have been recorded. There are a number of cases of that shape given the set of configurable filters. In theory it's possible to retain those in some cases, but I don't think that the complexity is justified. Similarly, whenever kernel branches are recorded it's necessary to drop the stale branches whenever branch recording is paused, as there's necessarily a blackout period and hence a gap in the records. Do you think that you have a case where losing branches across rotation *really* matters? > For a context-switch case, we need to save and re-inject branch record. > BRBE record sticks to a process context, no matter what events have been > enabled. I had originally wanted to keep per-event records around, but it doesn't work in all cases. One reason events get discarded at context-switch time is that CPU-bound events can sample branches, and would mis-attribute stale userspace branches to the wrong context when switching tasks. There are explicit comments about this in amd_pmu_brs_sched_task() and intel_pmu_lbr_sched_task(). Given we discard records when reprogramming events, we *could* try to preserve events in some cases, but I suspect that as with the rotation case this'll be a lot of complexity for little gain. Note that as we discard events when enabling the PMU, we'd throw some task-bound records away anyway, and practically the gain would be limited to cpu-bound records. Do you have a reason why you think we *must* keep events around? Mark.