> On Mar 1, 2020, at 8:24 PM, Yonghong Song <yhs@xxxxxx> wrote: >> + }, >> + { >> + .name = "instructions", >> + .attr = { >> + .freq = 0, >> + .sample_period = SAMPLE_PERIOD, >> + .inherit = 0, >> + .type = PERF_TYPE_HARDWARE, >> + .read_format = 0, >> + .sample_type = 0, >> + .config = PERF_COUNT_HW_INSTRUCTIONS, >> + }, >> + .ratio_metric = 1, >> + .ratio_mul = 1.0, >> + .ratio_desc = "insn per cycle", >> + }, >> + { >> + .name = "l1d_loads", >> + .attr = { >> + .freq = 0, >> + .sample_period = SAMPLE_PERIOD, >> + .inherit = 0, >> + .type = PERF_TYPE_HW_CACHE, >> + .read_format = 0, >> + .sample_type = 0, >> + .config = >> + PERF_COUNT_HW_CACHE_L1D | >> + (PERF_COUNT_HW_CACHE_OP_READ << 8) | >> + (PERF_COUNT_HW_CACHE_RESULT_ACCESS << 16), >> + }, > > why we do not have metric here? This follows perf-stat design: some events have another event to compare against, like instructions per cycle, etc. > >> + }, >> + { >> + .name = "llc_misses", >> + .attr = { >> + .freq = 0, >> + .sample_period = SAMPLE_PERIOD, >> + .inherit = 0, >> + .type = PERF_TYPE_HW_CACHE, >> + .read_format = 0, >> + .sample_type = 0, >> + .config = >> + PERF_COUNT_HW_CACHE_LL | >> + (PERF_COUNT_HW_CACHE_OP_READ << 8) | >> + (PERF_COUNT_HW_CACHE_RESULT_MISS << 16), >> + }, >> + .ratio_metric = 2, >> + .ratio_mul = 1e6, >> + .ratio_desc = "LLC misses per million isns", >> + }, >> +}; > > icache miss and itlb miss might be useful as well as the code will jump > to a different physical page. I think we should addd them. dtlb_miss > probably not a big problem, but it would be good to be an option. I plan to add more events later on. > > For ratio_metric, we explicitly assign a slot here. Any specific reason? > We can just say this metric *permits* ratio_metric and then ratio_matric > is assigned dynamically by the user command line options? > > I am thinking how we could support *all* metrics the underlying system > support based on `perf list`. This can be the future work though. We are also thinking about adding similar functionality to perf-stat, which will be more flexible. > >> + >> +u64 profile_total_count; > [...] >> + >> + reading_map_fd = bpf_map__fd(obj->maps.accum_readings); >> + count_map_fd = bpf_map__fd(obj->maps.counts); >> + if (reading_map_fd < 0 || count_map_fd < 0) { >> + p_err("failed to get fd for map"); >> + return; >> + } >> + >> + assert(bpf_map_lookup_elem(count_map_fd, &key, counts) == 0); > > In the patch, I see sometime bpf_map_lookup_elem() result is checked > with failure being handled. Sometimes, assert() is used. Maybe be > consistent with checking result approach? Will fix. [...] > >> >> +} >> + >> +#define PROFILE_DEFAULT_LONG_DURATION (3600 * 24) > > We need to let user know this value in "help" at least. > In "man" page it may be get updated but I think we probably > should add there as well. I am planning to just use UINT_MAX. > >> + [...] >> +#define BPF_PROG(name, args...) \ >> +name(unsigned long long *ctx); \ >> +static __always_inline typeof(name(0)) \ >> +____##name(unsigned long long *ctx, ##args); \ >> +typeof(name(0)) name(unsigned long long *ctx) \ >> +{ \ >> + _Pragma("GCC diagnostic push") \ >> + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ >> + return ____##name(___bpf_ctx_cast(args)); \ >> + _Pragma("GCC diagnostic pop") \ >> +} \ >> +static __always_inline typeof(name(0)) \ >> +____##name(unsigned long long *ctx, ##args) > > I know it is internal. But all the above macros are not great in > a bpf program. If we can reuse/amend current infrastructure. > That will be great. It may benefit users writing a similar > bpf program to here. This is copied from tools/testing/selftests/bpf/bpf_trace_helpers.h. I think we will move them to libbpf later. Then we can use that version instead. Thanks, Song