On Wed, Oct 23, 2024 at 02:24:13PM -0700, Andrii Nakryiko wrote:
> On Wed, Oct 23, 2024 at 1:32 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Wed, Oct 23, 2024 at 12:13:31PM -0700, Andrii Nakryiko wrote:
> > > On Wed, Oct 23, 2024 at 11:47 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > On Wed, Oct 23, 2024 at 09:12:52AM -0700, Andrii Nakryiko wrote:
> > > > > On Tue, Oct 22, 2024 at 5:09 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > Like in the software events, the BPF overflow handler can drop samples
> > > > > > by returning 0.  Let's count the dropped samples here too.
> > > > > >
> > > > > > Acked-by: Kyle Huey <me@xxxxxxxxxxxx>
> > > > > > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > > > > Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > > > Cc: Song Liu <song@xxxxxxxxxx>
> > > > > > Cc: bpf@xxxxxxxxxxxxxxx
> > > > > > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > > > > ---
> > > > > >  kernel/events/core.c | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > > > index 5d24597180dec167..b41c17a0bc19f7c2 100644
> > > > > > --- a/kernel/events/core.c
> > > > > > +++ b/kernel/events/core.c
> > > > > > @@ -9831,8 +9831,10 @@ static int __perf_event_overflow(struct perf_event *event,
> > > > > >         ret = __perf_event_account_interrupt(event, throttle);
> > > > > >
> > > > > >         if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT &&
> > > > > > -           !bpf_overflow_handler(event, data, regs))
> > > > > > +           !bpf_overflow_handler(event, data, regs)) {
> > > > > > +               atomic64_inc(&event->dropped_samples);
> > > > >
> > > > > I don't see the full patch set (please cc relevant people and mailing
> > > > > list on each patch in the patch set), but do we really want to pay the
> > > >
> > > > Sorry, you can find the whole series here.
> > > >
> > > >   https://lore.kernel.org/lkml/20241023000928.957077-1-namhyung@xxxxxxxxxx
> > > >
> > > > I thought it's mostly for the perf part so I didn't CC bpf folks but
> > > > I'll do in the next version.
> > > >
> > > > > price of atomic increment on what's the very typical situation of a
> > > > > BPF program returning 0?
> > > >
> > > > Is it typical for BPF_PROG_TYPE_PERF_EVENT?  I guess TRACING programs
> > > > usually return 0 but PERF_EVENT should care about the return values.
> > >
> > > Yeah, it's pretty much always `return 0;` for perf_event-based BPF
> > > profilers.  It's rather unusual to return non-zero, actually.
> >
> > Ok, then it may be local_t or plain unsigned long.  It should be
> > updated on overflow only.  Read can be racy but I think it's ok to
> > miss some numbers.  If someone really needs a precise count, they can
> > read it after disabling the event IMHO.
> >
> > What do you think?
>
> See [0] for unsynchronized increment absolutely killing the
> performance due to cache line bouncing between CPUs.  If the event is
> high-frequency and can be triggered across multiple CPUs in short
> succession, even an imprecise counter might be harmful.

Ok.

> In general, I'd say that if BPF attachment is involved, we probably
> should avoid maintaining unnecessary statistics.  Things like this
> event->dropped_samples increment can be done very efficiently and
> trivially from inside the BPF program, if at all necessary.

Right, we can do that in the BPF too.

> Having said that, if it's unlikely to have perf_event bouncing between
> multiple CPUs, it's probably not that big of a deal.

Yeah, perf_event is dedicated to a CPU or a task and the counter is
updated only in the overflow handler.  So I don't think it'd cause
cache line bouncing between CPUs.

Thanks,
Namhyung

> [0] https://lore.kernel.org/linux-trace-kernel/20240813203409.3985398-1-andrii@xxxxxxxxxx/
>
> > Thanks,
> > Namhyung