Re: Behavior of pinned perf event array

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 24 Sep 2020 13:14:10 -0700

On Thu, Sep 24, 2020 at 1:43 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 9/23/20 6:21 PM, Song Liu wrote:
> >> On Sep 14, 2020, at 3:59 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> >> On Fri, Sep 11, 2020 at 1:36 PM Song Liu <songliubraving@xxxxxx> wrote:
> [...]
> >> Daniel, are you aware of any use cases that do rely on such a behavior
> >> of PERV_EVENT_ARRAY?
> >>
> >> For me this auto-removal of elements on closing *one of a few*
> >> PERF_EVENT_ARRAY FDs (original one, but still just one of a few active
> >> ones) was extremely surprising. It doesn't follow what we do for any
> >> other BPF map, as far as I can tell. E.g., think about
> >> BPF_MAP_TYPE_PROG_ARRAY. If we pin it in BPF FS and close FD, it won't
> >> auto-remove all the tail call programs, right? There is exactly the
> >> same concern with not auto-releasing bpf_progs, just like with
> >> perf_event. But it's not accidental, if you are pinning a BPF map, you
> >> know what you are doing (at least we have to assume so :).
> >>
> >> So instead of adding an extra option, shouldn't we just fix this
> >> behavior instead and make it the same across all BPF maps that hold
> >> kernel resources?
> >
> > Could you please share your thoughts on this? I personally don't have
> > strong preference one way (add a flag) or the other (change the default
> > behavior). But I think we need to agree on the direction to go.
>
> My preference would be to have an opt-in flag, we do rely on the auto-removal
> of the perf event map entry on client close in Cilium at least, e.g. a monitor
> application can insert itself into the map to start receiving events from
> the BPF datapath, and upon exit (no matter whether graceful or not) we don't
> consume any more cycles in the data path than necessary for events, and
> from the __bpf_perf_event_output() we bail out right after checking the
> READ_ONCE(array->ptrs[index]) instead of pushing data that later on no-one
> picks up.

Well, then that's settled. We can't break existing use cases.