On Fri, Sep 27, 2024 at 10:35:54AM +0800, Tengda Wu wrote:
>
> On 2024/9/26 12:16, Namhyung Kim wrote:
> > On Wed, Sep 25, 2024 at 01:55:22PM +0000, Tengda Wu wrote:
> >> bperf restricts the size of perf_attr_map's entries to 16, which
> >> cannot hold all events in many scenarios. A typical example is
> >> when the user specifies `-a -ddd` ([0]). Another is top-down
> >> analysis, which often requires a set of more than 16 PMU events
> >> to be collected simultaneously.
> >>
> >> Fix this by increasing the perf_attr_map entries to 100, and
> >> introduce an event number check in bperf__load() so that users
> >> receive a friendlier prompt when the event limit is reached.
> >>
> >> [0] https://lore.kernel.org/all/20230104064402.1551516-3-namhyung@xxxxxxxxxx/
> >
> > Apparently this patch was never applied. I don't know how much you
> > need, but having too many events at the same time won't be very
> > useful because multiplexing could reduce the accuracy.
>
> Could you please explain why patch [0] was not merged at that time?
> I couldn't find this information in the previous emails.

I guess it just fell through the cracks. :)

> In my scenario, we collect more than 40 events to support the
> necessary metric calculations, so multiplexing is inevitable.
> Although multiplexing may reduce accuracy, that loss is acceptable
> for the purpose of supporting metric calculations.

Perf also has the same issue with multiplexing.

> Raising the event limit for bperf gives users additional options.
>
> In addition to accuracy, we also care about overhead. I compared the
> overhead of bperf and perf by running ./lat_ctx from lmbench [1], and
> found that the overhead of bperf stat is about 4% lower than that of
> perf. This is why we choose bperf in some extreme scenarios.

Ok, thanks for the explanation. I think it's ok to increase the limit.

Thanks,
Namhyung

> [1] https://github.com/intel/lmbench
>
> Thanks,
> Tengda
>
> >> Fixes: 7fac83aaf2ee ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
> >> Signed-off-by: Tengda Wu <wutengda@xxxxxxxxxxxxxxx>
> >> ---
> >>  tools/perf/util/bpf_counter.c | 8 +++++++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
> >> index 7a8af60e0f51..3346129c20cf 100644
> >> --- a/tools/perf/util/bpf_counter.c
> >> +++ b/tools/perf/util/bpf_counter.c
> >> @@ -28,7 +28,7 @@
> >>  #include "bpf_skel/bperf_leader.skel.h"
> >>  #include "bpf_skel/bperf_follower.skel.h"
> >>
> >> -#define ATTR_MAP_SIZE 16
> >> +#define ATTR_MAP_SIZE 100
> >>
> >>  static inline void *u64_to_ptr(__u64 ptr)
> >>  {
> >> @@ -451,6 +451,12 @@ static int bperf__load(struct evsel *evsel, struct target *target)
> >>  	enum bperf_filter_type filter_type;
> >>  	__u32 filter_entry_cnt, i;
> >>
> >> +	if (evsel->evlist->core.nr_entries > ATTR_MAP_SIZE) {
> >> +		pr_err("Too many events, please limit to %d or less\n",
> >> +		       ATTR_MAP_SIZE);
> >> +		return -1;
> >> +	}
> >> +
> >>  	if (bperf_check_target(evsel, target, &filter_type, &filter_entry_cnt))
> >>  		return -1;
> >>
> >> --
> >> 2.34.1
> >>
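
For reference, a minimal way to hit the old 16-entry limit (assuming bperf is
selected with perf stat's --bpf-counters option, as introduced in the commit
being fixed) is the `-a -ddd` case mentioned above, e.g.:

  perf stat --bpf-counters -a -ddd -- sleep 1

Before this patch, events past the 16th should fail once the shared attr map
is full; with it, up to 100 events fit and anything beyond that triggers the
new, clearer error message from bperf__load().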