Re: [RFC v2 PATCH bpf-next 0/4] bpf: add percpu stats for bpf_map

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 23, 2023 at 2:53 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 6/22/23 11:53 AM, Anton Protopopov wrote:
> > This series adds a mechanism for maps to populate per-cpu counters of elements
> > on insertions/deletions. The sum of these counters can be accessed by a new
> > kfunc from a map iterator program.
> >
> > The following patches are present in the series:
> >
> >    * Patch 1 adds a generic per-cpu counter to struct bpf_map
> >    * Patch 2 utilizes this mechanism for hash-based maps
> >    * Patch 3 extends the preloaded map iterator to dump the sum
> >    * Patch 4 adds a self-test for the change
> >
> > The reason for adding this functionality in our case (Cilium) is to get
> > signals about how full some heavy-used maps are and what the actual dynamic
> > profile of map capacity is. In the case of LRU maps this is impossible to get
> > this information anyhow else. See also [1].
> >
> > This is a v2 for the https://lore.kernel.org/bpf/20230531110511.64612-1-aspsk@xxxxxxxxxxxxx/T/#t
> > It was rewritten according to previous comments.  I've turned this series into
> > an RFC for two reasons:
> >
> > 1) This patch only works on systems where this_cpu_{inc,dec} is atomic for s64.
> > For systems which might write s64 non-atomically this would be required to use
> > some locking mechanism to prevent readers from reading trash via the
> > bpf_map_sum_elements_counter() kfunc (see patch 1)
> >
> > 2) In comparison with the v1, we're adding extra instructions per map operation
> > (for preallocated maps, as well as for non-preallocated maps). The only
> > functionality we're interested at the moment is the number of elements present
> > in a map, not a per-cpu statistics. This could be better achieved by using
> > the v1 version, which only adds computations for preallocated maps.
> >
> > So, the question is: won't it be fine to do the changes in the following way:
> >
> >    * extend the preallocated hash maps to populate percpu batch counters as in v1
> >    * add a kfunc as in v2 to get the current sum
> >
> > This works as
> >
> >    * nobody at the moment actually requires the per-cpu statistcs
> >    * this implementation can be transparently turned into per-cpu statistics, if
> >      such a need occurs on practice (the only thing to change would be to
> >      re-implement the kfunc and, maybe, add more kfuncs to get per-cpu stats)
> >    * the "v1 way" is the least intrusive: it only affects preallocated maps, as
> >      other maps already provide the required functionality
> >
> >    [1] https://lpc.events/event/16/contributions/1368/
> >
> > v1 -> v2:
> > - make the counters generic part of struct bpf_map
> > - don't use map_info and /proc/self/fdinfo in favor of a kfunc
>
> Tbh, I did like v1 approach a bit better. We are trying to bend over backwards just
> so that we don't add things to uapi, but in the end we are also adding it to the
> maps.debug, etc (yes, it has .debug in the name and all) ...

I think we should keep bending even more backwards to avoid uapi burden.
All new features should be non-uapi no matter how many people
say "I'll definitely use it".

> or as an extensible bpf_map_info stats extension in case there is some agreement?

I'd rather not.
bpf_map_info returns what user space sent to the kernel earlier.
stats or anything that user space didn't explicitly give earlier
is quite different.
Same goes for bpf_prog_info.
We only added verified_insns there that doesn't fit the above definition
and it was a mistake. After almost 2 years it is still unused
and cannot be removed.
veristat is parsing non-uapi verifier log.
Tooling can live with non-uapi map stats just fine.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux