On Wed, May 15, 2024 at 10:04 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Wed, May 15, 2024 at 9:20 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > Allow uid and gid to be terms in BPF filters by first breaking the
> > connection between filter terms and PERF_SAMPLE_xx values. Calculate
> > the uid and gid using the bpf_get_current_uid_gid helper, rather than
> > from a value in the sample. Allow filters to be passed to perf top,
> > this allows:
> >
> > $ perf top -e cycles:P --filter "uid == $(id -u)"
> >
> > to work as a "perf top -u" workaround, as "perf top -u" usually fails
> > due to processes/threads terminating between the /proc scan and the
> > perf_event_open.
>
> Fwiw, something I noticed playing around with this (my workload was
> `perf test -w noploop 100000` as different users) is that old samples
> appeared to linger around making terminated processes still appear in
> the top list. My guess is that there aren't other samples showing up
> and pushing the old sample events out of the ring buffers due to the
> filter. This can look quite odd and I don't know if we have a way to
> improve upon it, flush the ring buffers, histograms, etc. It appears
> to be a latent `perf top` issue that you could encounter on other low
> frequency events, but I thought I'd mention it anyway.

Some other thoughts:

- It is kind of annoying with the --filter option (either on top or
  record) that there first needs to be an event to filter on. It'd be
  nice if we could just filter the default event.
- Should "perf top --uid=1234" be removed or turned into an alias for
  '--filter "uid == $(id -u)"', given the --uid option generally
  doesn't work?
- What should happen to the perf top --pid and --tid options? Should
  they be filters? Should they fall back on /proc scanning if there
  aren't sufficient BPF permissions? The plumbing for that is going to
  be messy.
- There should probably be a way to filter on cgroups.
- Does the user care that there are 3 kinds of filter that will work
  differently? Could we break them apart to make it more explicit? I
  may want tracepoint events with a BPF filter. How can we ensure 1
  syntax for the 3 kinds of filter?
- Filtering on register values could be interesting, for example,
  sampling on memcpy-s where the length is over a threshold. We have a
  register capture test:
  https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/record.sh#n81
  Perhaps the filter could look something like
  'perf record -g -e mem:$ADDRESS_OF_MEMCPY:x --filter "reg:rdx > 1024"'
  - this makes me think we need a more convenient way to specify
  memory addresses as symbols.

Thanks,
Ian

> > Ian Rogers (3):
> >   perf bpf filter: Give terms their own enum
> >   perf bpf filter: Add uid and gid terms
> >   perf top: Allow filters on events
> >
> >  tools/perf/Documentation/perf-record.txt     |  2 +-
> >  tools/perf/Documentation/perf-top.txt        |  4 ++
> >  tools/perf/builtin-top.c                     |  9 +++
> >  tools/perf/util/bpf-filter.c                 | 55 ++++++++++++----
> >  tools/perf/util/bpf-filter.h                 |  5 +-
> >  tools/perf/util/bpf-filter.l                 | 66 +++++++++----------
> >  tools/perf/util/bpf-filter.y                 |  7 +-
> >  tools/perf/util/bpf_skel/sample-filter.h     | 27 +++++++-
> >  tools/perf/util/bpf_skel/sample_filter.bpf.c | 67 +++++++++++++++-----
> >  9 files changed, 172 insertions(+), 70 deletions(-)
> >
> > --
> > 2.45.0.rc1.225.g2a3ae87e7f-goog
> >
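
For anyone following along, here is a rough sketch of how a uid or gid term can be derived from bpf_get_current_uid_gid. The helper is documented as returning a single 64-bit value built as current_gid << 32 | current_uid, so the two ids have to be split apart before comparing. This is plain userspace C for illustration only, not the code from sample_filter.bpf.c, and the function names (packed_uid, packed_gid, uid_matches) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper names, not taken from the patch series.
 * The input is assumed to have the layout documented for
 * bpf_get_current_uid_gid(): gid in the high 32 bits, uid in the low.
 */

/* Extract the UID from the low 32 bits of the packed value. */
static inline uint32_t packed_uid(uint64_t uid_gid)
{
	return (uint32_t)(uid_gid & 0xffffffffu);
}

/* Extract the GID from the high 32 bits of the packed value. */
static inline uint32_t packed_gid(uint64_t uid_gid)
{
	return (uint32_t)(uid_gid >> 32);
}

/* Evaluate a filter of the form "uid == wanted_uid". */
static inline bool uid_matches(uint64_t uid_gid, uint32_t wanted_uid)
{
	return packed_uid(uid_gid) == wanted_uid;
}
```

In a BPF program the packed value would come straight from the helper call rather than from the sample, which is why the terms no longer need to map onto a PERF_SAMPLE_xx bit.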