> On Mar 17, 2020, at 1:03 PM, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 3/17/20 8:54 PM, Song Liu wrote:
>>> On Mar 17, 2020, at 12:30 PM, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>>> On 3/16/20 9:33 PM, Song Liu wrote:
>>>> Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats.
>>>> Typical userspace tools use kernel.bpf_stats_enabled as follows:
>>>> 1. Enable kernel.bpf_stats_enabled;
>>>> 2. Check program run_time_ns;
>>>> 3. Sleep for the monitoring period;
>>>> 4. Check program run_time_ns again, calculate the difference;
>>>> 5. Disable kernel.bpf_stats_enabled.
>>>> The problem with this approach is that only one userspace tool can toggle
>>>> this sysctl. If multiple tools toggle the sysctl at the same time, the
>>>> measurement may be inaccurate.
>>>> To fix this problem while keeping backward compatibility, introduce a new
>>>> bpf command BPF_ENABLE_RUNTIME_STATS. On success, this command enables
>>>> run_time_ns stats and returns a valid fd.
>>>> With BPF_ENABLE_RUNTIME_STATS, a user space tool would have the following
>>>> flow:
>>>> 1. Get a fd with BPF_ENABLE_RUNTIME_STATS, and make sure it is valid;
>>>> 2. Check program run_time_ns;
>>>> 3. Sleep for the monitoring period;
>>>> 4. Check program run_time_ns again, calculate the difference;
>>>> 5. Close the fd.
>>>> Signed-off-by: Song Liu <songliubraving@xxxxxx>
>>>
>>> Hmm, I see no relation to /dev/bpf_stats anymore, yet the subject still talks
>>> about it?
>> My fault. Will fix.
>>> Also, should this have bpftool integration now that we have `bpftool prog profile`
>>> support? Would be nice to then fetch the related stats via bpf_prog_info, so users
>>> can consume this in an easy way.
>> We can add "run_time_ns" as a metric to "bpftool prog profile". But the
>> mechanism is not the same though. Let me think about this.
>
> Hm, true as well. Wouldn't long-term extending "bpftool prog profile" fentry/fexit
> programs supersede this old bpf_stats infrastructure? Iow, can't we implement the
> same (or even more elaborate stats aggregation) in BPF via fentry/fexit and then
> potentially deprecate bpf_stats counters?

I think run_time_ns has its own value as a simple monitoring framework. We can
use it in tools like top (and variations). It will be easier for these tools to
adopt run_time_ns than to use fentry/fexit.

On the other hand, in the long term, we may include a few fentry/fexit based
programs in the kernel binary (or the rpm), so that these tools can use them
easily. At that time, we can fully deprecate run_time_ns. Maybe this is not
too far away?

Thanks,
Song
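
For illustration only, steps 2-4 of the flow quoted above (check run_time_ns, sleep, check again, take the difference) could be sketched roughly as below. This is not code from the patch. ProgStats, stats_delta, measure and the read_stats callback are hypothetical names; run_cnt is assumed to be read from bpf_prog_info alongside run_time_ns; and enabling stats (step 1) and closing the fd (step 5) are left to the caller.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgStats:
    """One snapshot of a program's cumulative counters (hypothetical container)."""
    run_time_ns: int  # cumulative run time, as in the flow above
    run_cnt: int      # cumulative run count (assumed to come from bpf_prog_info)


def stats_delta(before, after):
    """Return (elapsed_ns, runs, avg_ns_per_run) between two snapshots."""
    elapsed_ns = after.run_time_ns - before.run_time_ns
    runs = after.run_cnt - before.run_cnt
    if elapsed_ns < 0 or runs < 0:
        # Counters are cumulative; a negative delta means the snapshots
        # are out of order or belong to different programs.
        raise ValueError("snapshots out of order or from different programs")
    avg_ns = elapsed_ns / runs if runs else 0.0
    return elapsed_ns, runs, avg_ns


def measure(read_stats, period_s, sleep=time.sleep):
    """Steps 2-4: sample, sleep for the monitoring period, sample again, diff.

    read_stats is a caller-supplied callable returning a ProgStats; how it
    obtains the counters is outside this sketch.
    """
    before = read_stats()
    sleep(period_s)
    after = read_stats()
    return stats_delta(before, after)
```

A tool like top would call measure() once per refresh interval while it holds the stats fd open, so the counters keep advancing for the whole monitoring period.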