On 20/11/2024 00:51, Peter Zijlstra wrote:
On Tue, Nov 19, 2024 at 06:45:57AM -0800, Vadim Fedorenko wrote:
On 19/11/2024 03:47, Peter Zijlstra wrote:
On Mon, Nov 18, 2024 at 10:52:45AM -0800, Vadim Fedorenko wrote:
+int bpf_cpu_cycles(void)
+{
+ struct bpf_pidns_info pidns;
+ __u64 start;
+
+ start = bpf_get_cpu_cycles();
+ bpf_get_ns_current_pid_tgid(0, 0, &pidns, sizeof(struct bpf_pidns_info));
+ cycles = bpf_get_cpu_cycles() - start;
+ ns = bpf_cpu_cycles_to_ns(cycles);
+ return 0;
+}
Oh, the intent is to use that cycles_to_ns() on deltas. That wasn't at
all clear.
Yep, that's the main use case, it was discussed in the previous
versions of the patchset.
Should bloody well be in the changelogs then. As is I'm tempted to NAK
the entire series because there is not a single word on WHY for any of
this.
Sure, I'll add this info in the next version.
Anyway, the above has more problems than just bad naming. TSC is
constant and not affected by DVFS, so depending on the DVFS state of
things your function will return wildly different readings.
Why should I care about DVFS? The use case is to measure the time spent
in some code. If we replace it with bpf_ktime_get_ns(), it will also be
affected by DVFS, and it's fine. We will be able to see the difference
for different DVFS states.
Again, this goes to usage, why do you want this, what are you going to
do with the results?
Run-ro-run numbers will be absolutely useless because of DVFS.
We do a lot of measurements of bpf programs and kernel stack functions
at scale. We can filter out variations due to DVFS as well as slice the
results by the HW generations, etc. In general, we do see benefits of
the values we gather. The goal of this patchset is to optimize the
overhead added by bpf_ktime_get_ns(). Andrii has already shown the
benefit in [1]. TL;DR - it's about 35-40% faster than the pair of
bpf_ktime_get_ns(). And helps to save a lot of CPU at scale.
[1]
https://lore.kernel.org/bpf/CAEf4BzaRb+fUK17wrj4sWnYM5oKxTvwZC=U-GjvsdUtF94PqrA@xxxxxxxxxxxxxx/