Commit 47c09d6a9f67("bpftool: Introduce "prog profile" command") introduced "bpftool prog profile" command which can be used to profile bpf program with metrics like # of instructions, This patch added support for itlb_misses and dtlb_misses. During an internal bpf program performance evaluation, I found these two metrics are also very useful. The following is an example output: $ bpftool prog profile id 324 duration 3 cycles itlb_misses 1885029 run_cnt 5134686073 cycles 306893 itlb_misses $ bpftool prog profile id 324 duration 3 cycles dtlb_misses 1827382 run_cnt 4943593648 cycles 5975636 dtlb_misses $ bpftool prog profile id 324 duration 3 cycles llc_misses 1836527 run_cnt 5019612972 cycles 4161041 llc_misses >From the above, we can see quite some dtlb misses, 3 dtlb misses perf prog run. This might be something worth further investigation. Cc: Song Liu <songliubraving@xxxxxx> Signed-off-by: Yonghong Song <yhs@xxxxxx> --- tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index acdb2c245f0a..e33f27b950a5 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -1717,11 +1717,39 @@ struct profile_metric { .ratio_desc = "LLC misses per million insns", .ratio_mul = 1e6, }, + { + .name = "itlb_misses", + .attr = { + .type = PERF_TYPE_HW_CACHE, + .config = + PERF_COUNT_HW_CACHE_ITLB | + (PERF_COUNT_HW_CACHE_OP_READ << 8) | + (PERF_COUNT_HW_CACHE_RESULT_MISS << 16), + .exclude_user = 1 + }, + .ratio_metric = 2, + .ratio_desc = "itlb misses per million insns", + .ratio_mul = 1e6, + }, + { + .name = "dtlb_misses", + .attr = { + .type = PERF_TYPE_HW_CACHE, + .config = + PERF_COUNT_HW_CACHE_DTLB | + (PERF_COUNT_HW_CACHE_OP_READ << 8) | + (PERF_COUNT_HW_CACHE_RESULT_MISS << 16), + .exclude_user = 1 + }, + .ratio_metric = 2, + .ratio_desc = "dtlb misses per million insns", + .ratio_mul = 1e6, + }, }; static __u64 profile_total_count; -#define MAX_NUM_PROFILE_METRICS 4 +#define MAX_NUM_PROFILE_METRICS 6 static int profile_parse_metrics(int argc, char **argv) { @@ -2109,7 +2137,7 @@ static int do_help(int argc, char **argv) " struct_ops | fentry | fexit | freplace | sk_lookup }\n" " ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n" " flow_dissector }\n" - " METRIC := { cycles | instructions | l1d_loads | llc_misses }\n" + " METRIC := { cycles | instructions | l1d_loads | llc_misses | itlb_misses | dtlb_misses }\n" " " HELP_SPEC_OPTIONS "\n" "", bin_name, argv[-2]); -- 2.24.1