Re: Question about bpf perfbuf/ringbuf: pinned in backend with overwriting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



BTW, if anyone's interested, there's a "benchmark" trace event when you
enable:

  CONFIG_TRACEPOINT_BENCHMARK=y

You can see the code in: kernel/trace/trace_benchmark.c:

Which does a loop of:

	local_irq_disable();
	start = trace_clock_local();
	trace_benchmark_event(bm_str, bm_last);
	stop = trace_clock_local();
	local_irq_enable();

Where it writes the result of the previous timings into the current trace
event via the bm_str:


	delta = stop - start;

	[..]

	bm_last = delta;

	[..]

	scnprintf(bm_str, BENCHMARK_EVENT_STRLEN,
		  "last=%llu first=%llu max=%llu min=%llu avg=%u std=%d std^2=%lld",
		  bm_last, bm_first, bm_max, bm_min, avg, std, stddev);




I ran: perf record -a -e benchmark:benchmark_event sleep 20

and perf script produces (I scrolled down to get to hot cache):

 event_benchmark    2289 [001]   672.581425: benchmark:benchmark_event: last=247 first=5693 max=8969 min=204 avg=240 std=234 std^2=55157 delta=247
 event_benchmark    2289 [001]   672.581426: benchmark:benchmark_event: last=222 first=5693 max=8969 min=204 avg=240 std=234 std^2=55151 delta=222
 event_benchmark    2289 [001]   672.581427: benchmark:benchmark_event: last=229 first=5693 max=8969 min=204 avg=240 std=234 std^2=55144 delta=229
 event_benchmark    2289 [001]   672.581427: benchmark:benchmark_event: last=221 first=5693 max=8969 min=204 avg=240 std=234 std^2=55138 delta=221
 event_benchmark    2289 [001]   672.581428: benchmark:benchmark_event: last=223 first=5693 max=8969 min=204 avg=240 std=234 std^2=55131 delta=223
 event_benchmark    2289 [001]   672.581428: benchmark:benchmark_event: last=220 first=5693 max=8969 min=204 avg=240 std=234 std^2=55125 delta=220
 event_benchmark    2289 [001]   672.581429: benchmark:benchmark_event: last=215 first=5693 max=8969 min=204 avg=240 std=234 std^2=55118 delta=215
 event_benchmark    2289 [001]   672.581430: benchmark:benchmark_event: last=221 first=5693 max=8969 min=204 avg=240 std=234 std^2=55112 delta=221
 event_benchmark    2289 [001]   672.581430: benchmark:benchmark_event: last=240 first=5693 max=8969 min=204 avg=240 std=234 std^2=55105 delta=240
 event_benchmark    2289 [001]   672.581431: benchmark:benchmark_event: last=225 first=5693 max=8969 min=204 avg=240 std=234 std^2=55099 delta=225
 event_benchmark    2289 [001]   672.581432: benchmark:benchmark_event: last=235 first=5693 max=8969 min=204 avg=240 std=234 std^2=55092 delta=235
 event_benchmark    2289 [001]   672.581432: benchmark:benchmark_event: last=220 first=5693 max=8969 min=204 avg=240 std=234 std^2=55086 delta=220
 event_benchmark    2289 [001]   672.581433: benchmark:benchmark_event: last=245 first=5693 max=8969 min=204 avg=240 std=234 std^2=55079 delta=245
 event_benchmark    2289 [001]   672.581433: benchmark:benchmark_event: last=215 first=5693 max=8969 min=204 avg=240 std=234 std^2=55073 delta=215
 event_benchmark    2289 [001]   672.581434: benchmark:benchmark_event: last=216 first=5693 max=8969 min=204 avg=240 std=234 std^2=55066 delta=216


For ftrace: trace-cmd record -e benchmark_event sleep 20

trace-cmd report:

 event_benchmark-2253  [000]   549.747068: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=78
 event_benchmark-2253  [000]   549.747069: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=79
 event_benchmark-2253  [000]   549.747069: benchmark_event:      last=72 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=72
 event_benchmark-2253  [000]   549.747069: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=79
 event_benchmark-2253  [000]   549.747070: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=78
 event_benchmark-2253  [000]   549.747070: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=78
 event_benchmark-2253  [000]   549.747071: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=79
 event_benchmark-2253  [000]   549.747071: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=78
 event_benchmark-2253  [000]   549.747072: benchmark_event:      last=80 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=80
 event_benchmark-2253  [000]   549.747072: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=79
 event_benchmark-2253  [000]   549.747073: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=933 delta=78
 event_benchmark-2253  [000]   549.747073: benchmark_event:      last=165 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=165
 event_benchmark-2253  [000]   549.747074: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=79
 event_benchmark-2253  [000]   549.747074: benchmark_event:      last=153 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=153
 event_benchmark-2253  [000]   549.747075: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=78
 event_benchmark-2253  [000]   549.747075: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=78
 event_benchmark-2253  [000]   549.747076: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=935 delta=78
 event_benchmark-2253  [000]   549.747076: benchmark_event:      last=73 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=73
 event_benchmark-2253  [000]   549.747077: benchmark_event:      last=79 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=79
 event_benchmark-2253  [000]   549.747077: benchmark_event:      last=78 first=2674 max=1185 min=71 avg=84 std=30 std^2=934 delta=78


For normal tracing, the average perf event takes 204ns per event, and the
average ftrace event takes 84ns per event. The "first" in the output above
is how long the first event took (which was cold cache).

I added filtering to trace-cmd with:

 trace-cmd record -o trace-filter.dat -e benchmark_event -f 'delta & 1' sleep 20

I should modify the event to have a counter so that I can filter every
other event with that, but for now I just print out anything that has an
odd delta.

 event_benchmark-2548  [000]  1558.776493: benchmark_event:       str=last=199 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=199
 event_benchmark-2548  [000]  1558.776498: benchmark_event:       str=last=43 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=43
 event_benchmark-2548  [000]  1558.776498: benchmark_event:       str=last=191 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=191
 event_benchmark-2548  [000]  1558.776500: benchmark_event:       str=last=41 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=41
 event_benchmark-2548  [000]  1558.776500: benchmark_event:       str=last=119 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=119
 event_benchmark-2548  [000]  1558.776501: benchmark_event:       str=last=41 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=41
 event_benchmark-2548  [000]  1558.776502: benchmark_event:       str=last=105 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=105
 event_benchmark-2548  [000]  1558.776503: benchmark_event:       str=last=41 first=1964 max=2215 min=40 avg=78 std=44 std^2=2022 delta=41
 event_benchmark-2548  [000]  1558.776505: benchmark_event:       str=last=41 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=41
 event_benchmark-2548  [000]  1558.776505: benchmark_event:       str=last=111 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=111
 event_benchmark-2548  [000]  1558.776506: benchmark_event:       str=last=109 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=109
 event_benchmark-2548  [000]  1558.776506: benchmark_event:       str=last=109 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=109
 event_benchmark-2548  [000]  1558.776508: benchmark_event:       str=last=41 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=41
 event_benchmark-2548  [000]  1558.776508: benchmark_event:       str=last=109 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=109
 event_benchmark-2548  [000]  1558.776509: benchmark_event:       str=last=117 first=1964 max=2215 min=40 avg=78 std=44 std^2=2021 delta=117
 event_benchmark-2548  [000]  1558.776510: benchmark_event:       str=last=51 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=51
 event_benchmark-2548  [000]  1558.776510: benchmark_event:       str=last=103 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=103
 event_benchmark-2548  [000]  1558.776511: benchmark_event:       str=last=109 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=109
 event_benchmark-2548  [000]  1558.776512: benchmark_event:       str=last=51 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=51
 event_benchmark-2548  [000]  1558.776512: benchmark_event:       str=last=103 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=103
 event_benchmark-2548  [000]  1558.776513: benchmark_event:       str=last=95 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=95
 event_benchmark-2548  [000]  1558.776513: benchmark_event:       str=last=101 first=1964 max=2215 min=40 avg=78 std=44 std^2=2020 delta=101
 event_benchmark-2548  [000]  1558.776514: benchmark_event:       str=last=109 first=1964 max=2215 min=40 avg=78 std=44 std^2=2019 delta=109

It looks like throwing away an event is around 40-50ns, where as now the
copying of the event into a temp buffer before writing it into the ring
buffer increased the time from 84ns to around 100-110ns. Still half the
time it takes for the perf event.

The above trace event benchmark has been part of the Linux kernel since
3.16, so everyone should have it if you want to run your own tests.

-- Steve





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux