On 2024/10/10 12:53, Tengda Wu wrote:
>
>
> On 2024/10/10 8:31, Namhyung Kim wrote:
>> On Wed, Oct 09, 2024 at 10:18:44AM -0700, Song Liu wrote:
>>> On Sun, Sep 15, 2024 at 6:53 PM Tengda Wu <wutengda@xxxxxxxxxxxxxxx> wrote:
>>>>
>>>> bperf has a nice ability to share PMUs, but it still does not support
>>>> inheriting events during fork(), resulting in some deviations in its
>>>> stat results compared with perf.
>>>>
>>>> perf stat result:
>>>> $ ./perf stat -e cycles,instructions -- ./perf test -w sqrtloop
>>>>
>>>>  Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>>      2,316,038,116      cycles
>>>>      2,859,350,725      instructions
>>>>
>>>>        1.009603637 seconds time elapsed
>>>>
>>>>        1.004196000 seconds user
>>>>        0.003950000 seconds sys
>>>>
>>>> bperf stat result:
>>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>>>       ./perf test -w sqrtloop
>>>>
>>>>  Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>>         18,762,093      cycles
>>>>         23,487,766      instructions
>>>>
>>>>        1.008913769 seconds time elapsed
>>>>
>>>>        1.003248000 seconds user
>>>>        0.004069000 seconds sys
>>>>
>>>> In order to support event inheritance, two new bpf programs are added
>>>> to monitor the fork and exit of tasks respectively. When a task is
>>>> created, add it to the filter map to enable counting, and reuse the
>>>> `accum_key` of its parent task to count together with the parent task.
>>>> When a task exits, remove it from the filter map to disable counting.
>>>>
>>>> After support:
>>>> $ ./perf stat --bpf-counters -e cycles,instructions -- \
>>>>       ./perf test -w sqrtloop
>>>>
>>>>  Performance counter stats for './perf test -w sqrtloop':
>>>>
>>>>      2,316,252,189      cycles
>>>>      2,859,946,547      instructions
>>>>
>>>>        1.009422314 seconds time elapsed
>>>>
>>>>        1.003597000 seconds user
>>>>        0.004270000 seconds sys
>>>>
>>>> Signed-off-by: Tengda Wu <wutengda@xxxxxxxxxxxxxxx>
>>>
>>> The solution looks good to me. Question on the UI: do we always
>>> want the inherit behavior from PID and TGID monitoring? If not,
>>> maybe we should add a flag for it. (I think we do need the flag).
>>
>> I think it should depend on the value of attr.inherit. Maybe we can
>> disable the autoload for !inherit.
>>
>
> Got it. The attr.inherit flag (related to --no-inherit in the perf command)
> is suitable for controlling inherit behavior. I will fix it. Thanks!
>
>>>
>>> One nitpick below.
>>>
>>> Thanks,
>>> Song
>>>
>>> [...]
>>>>
>>>> +struct bperf_filter_value {
>>>> +	__u32 accum_key;
>>>> +	__u8 exited;
>>>> +};
>>> nit:
>>> Can we use a special value of accum_key to replace exited==1
>>> case?
>>
>> I'm not sure. I guess it still needs to use the accum_key to save the
>> final value when exited = 1.
>
> In theory, it is possible. The accum_key is currently only used to index
> values in the accum_readings map, so if the task is not being counted, the
> accum_key can be set to a special value.
>
> Since accum_key is of u32 type, there are two special values to choose
> from: 0 or max_entries+1. I think the latter, max_entries+1, may be more
> suitable because it can avoid memory waste in the accum_readings map and
> does not require too many changes to bpf_counter.
>

Sorry, I was wrong. As Namhyung said, 'accum_readings[accum_key]' saves the
last count of the task when it exits. If accum_key is set to a special value
at this time, the count will be lost.

So exited==1 is necessary; we cannot use a special value of accum_key to
replace it.

Thanks,
Tengda

>
>>
>> Thanks,
>> Namhyung
>>
>>>
>>>> +
>>>>  #endif /* __BPERF_STAT_U_H */
>>>> --
>>>> 2.34.1
>>>>
>
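
To make the reasoning above concrete, here is a rough userspace sketch of the
two ways to mark an exited task. It is not the patch code: a plain array
stands in for the accum_readings map, and MAX_ENTRIES, NO_ACCUM_KEY and all
helper names are hypothetical, chosen only for illustration. Only the two
struct fields come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ENTRIES  4u                  /* hypothetical map size */
#define NO_ACCUM_KEY (MAX_ENTRIES + 1u)  /* hypothetical sentinel key */

/* Same two fields as in the patch, using stdint types in userspace. */
struct bperf_filter_value {
	uint32_t accum_key;
	uint8_t exited;
};

/* Stand-in for the accum_readings map (assumption: a plain array). */
static uint64_t accum_readings[MAX_ENTRIES];

/* Separate flag: the key still points at the task's slot. */
void mark_exited_with_flag(struct bperf_filter_value *v)
{
	v->exited = 1;
}

/* Sentinel variant: the key is overwritten, so the slot index is gone. */
void mark_exited_with_sentinel(struct bperf_filter_value *v)
{
	v->accum_key = NO_ACCUM_KEY;
}

/* Returns 1 and stores the last count if the slot is still reachable. */
int read_final_count(const struct bperf_filter_value *v, uint64_t *out)
{
	if (v->accum_key >= MAX_ENTRIES)
		return 0;
	*out = accum_readings[v->accum_key];
	return 1;
}

/* Walks through both variants for one task that counted 1000 events. */
void exit_encoding_demo(void)
{
	struct bperf_filter_value flagged = { .accum_key = 2, .exited = 0 };
	struct bperf_filter_value sentinel = flagged;
	uint64_t count = 0;
	int ok;

	accum_readings[2] = 1000;  /* count accumulated before exit */

	mark_exited_with_flag(&flagged);
	ok = read_final_count(&flagged, &count);
	assert(ok && count == 1000);  /* last count is still readable */

	mark_exited_with_sentinel(&sentinel);
	ok = read_final_count(&sentinel, &count);
	assert(!ok);  /* slot 2 can no longer be found through this entry */
}
```

With the flag, the entry keeps its accum_key and the last count can still be
read after exit. With the sentinel, the index of the slot is overwritten at
exit time, which is the loss described above.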