On Mon, Feb 24, 2025 at 06:03:01PM +0000, Leo Yan wrote: > On Mon, Feb 24, 2025 at 04:05:43PM +0000, Mark Rutland wrote: > > [...] > > > > > > Just a minor concern. I don't see any handling for task migration. > > > > > E.g., for a task is migrated from one CPU to another CPU, I expect we > > > > > need to save and restore branch records based on BRBE injection. So > > > > > far, the driver simply invalidates all records. > > > > > > > > > > I think this topic is very likely discussed before. If this is the > > > > > case, please ignore my comment. Except this, the code looks good > > > > > to me. > > > > > > > > Not really discussed on the list, but that was present in v18 (though > > > > not functional because .sched_task() hook wasn't actually enabled) and > > > > Mark removed it. His work is here[1].The only comment was: > > > > > > > > Note: saving/restoring at context-switch doesn't interact well with > > > > event rotation (e.g. if filters change) > > > > > > In the brbe_enable() function, it "Merge the permitted branch filters > > > of all events". Based on current implementation, all events share the > > > same branch filter. > > > > Critically, the brbe_enable() function merges the filters of all > > *active* events which have been installed into hardware. It does not > > track all events which can be rotated, and the resulting filter is not > > the same -- it can change as a result of rotation. > > In a perf session has multiple events, and events have different branch > filters, seems to me, a simple way is to return error for this case. FWIW, I'd generally prefer to do that since it avoids a number of horrible edge-cases and gets rid of the need to do SW filtering, which falls somewhere between "tricky" and "not entirely possible". However, that's not what LBR and others do, which is why we went with filter merging. If folk on the tools side are happy with the kernel rejecting conflicting events, then I'd be more than happy to do that. What I don't want is that we start off with that approach and people immediately start to complain that the BRBE driver rejects events that the LBR driver accepts. See the last time this came up: https://lore.kernel.org/linux-arm-kernel/Zli6Ot0TwK3Qy-ee@J2N7QTR9R3/ https://lore.kernel.org/linux-arm-kernel/ZlnKFFwv9612V81h@J2N7QTR9R3/ > I would argue BRBE is an IP for recording branches per CPU wise, it does > not support recording for event wise. Yes, there is a mismatch between the hardware and the perf ABI. > If we can unify branch filter within a perf session, would this be > much easier for handling? Do you mean if the perf tool ensured that all events in a given session had the same filter? From the kernel's PoV there's no such thing as a "perf session", and I'm not sure whether you're suggesting doing that in userspace or withing the kernel. Doing that in the perf tool would certianly make a stronger argument for the kernel taking the "reject conflicting branch filters" option. Doing that within the kernel isn't really possible. > > > When event rotation happens, if without context switch, in theory we > > > should can directly use the branch record (no invalidation, no injection) > > > for all events. > > > > No; that only works in *some* cases, and will produce incorrect results > > in others. > > > > For example, consider filtering. Imagine a PMU with a single counter, > > and two events, where event-A filters for calls-and-returns and event-B > > filters for calls-only. When switching from event-A to event-B, it's > > theoretically possible to keep the existing records around, knowing that > > the returns can be filtered out later. When switching from event-B to > > event-A we cannot keep the existing records, since there are gaps > > whenever a return should have been recorded. > > Seems to me, the problem is not caused by event rotation. We need to > calculate a correct filter in the first place - the BRBE driver should > calculate a superset for all filters of events for a session. Then, > generate branch record based event's specific filter.