Re: Behavior of pinned perf event array

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Sep 24, 2020, at 1:14 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
> 
> On Thu, Sep 24, 2020 at 1:43 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>> 
>> On 9/23/20 6:21 PM, Song Liu wrote:
>>>> On Sep 14, 2020, at 3:59 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>>>> On Fri, Sep 11, 2020 at 1:36 PM Song Liu <songliubraving@xxxxxx> wrote:
>> [...]
>>>> Daniel, are you aware of any use cases that do rely on such a behavior
>>>> of PERV_EVENT_ARRAY?
>>>> 
>>>> For me this auto-removal of elements on closing *one of a few*
>>>> PERF_EVENT_ARRAY FDs (original one, but still just one of a few active
>>>> ones) was extremely surprising. It doesn't follow what we do for any
>>>> other BPF map, as far as I can tell. E.g., think about
>>>> BPF_MAP_TYPE_PROG_ARRAY. If we pin it in BPF FS and close FD, it won't
>>>> auto-remove all the tail call programs, right? There is exactly the
>>>> same concern with not auto-releasing bpf_progs, just like with
>>>> perf_event. But it's not accidental, if you are pinning a BPF map, you
>>>> know what you are doing (at least we have to assume so :).
>>>> 
>>>> So instead of adding an extra option, shouldn't we just fix this
>>>> behavior instead and make it the same across all BPF maps that hold
>>>> kernel resources?
>>> 
>>> Could you please share your thoughts on this? I personally don't have
>>> strong preference one way (add a flag) or the other (change the default
>>> behavior). But I think we need to agree on the direction to go.
>> 
>> My preference would be to have an opt-in flag, we do rely on the auto-removal
>> of the perf event map entry on client close in Cilium at least, e.g. a monitor
>> application can insert itself into the map to start receiving events from
>> the BPF datapath, and upon exit (no matter whether graceful or not) we don't
>> consume any more cycles in the data path than necessary for events, and
>> from the __bpf_perf_event_output() we bail out right after checking the
>> READ_ONCE(array->ptrs[index]) instead of pushing data that later on no-one
>> picks up.
> 
> Well, then that's settled. We can't break existing use cases.

Thanks Daniel and Andrii! I will draft the patch with a new flag. 

Song






[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux