Re: [PATCH v3 bpf-next 1/5] bpf: Add bloom filter map implementation

Martin KaFai Lau <kafai@xxxxxx> · Wed, 22 Sep 2021 15:08:44 -0700

On Wed, Sep 22, 2021 at 01:52:12PM -0700, Andrii Nakryiko wrote:
> > Agree that a generic hash helper is in general useful.  It may be
> > useful in hashing the skb also.  The bpf prog only implementation could
> > have more flexibility in configuring roundup to pow2 or not, how to hash,
> > how many hashes, nr of bits ...etc.  In the mean time, the bpf prog and
> 
> Exactly. If I know better how many bits I need, I'll have to reverse
> engineer kernel's heuristic to provide such max_entries values to
> arrive at the desired amount of memory that Bloom filter will be
> using.
Good point.  I don't think the user needs to guess, though.  The formula
is stable and publicly known.  The comment describing it in
kernel/bpf/bloom_filter.c should be moved to include/uapi/linux/bpf.h.
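
For reference, a rough sketch of the kind of sizing arithmetic being
discussed.  It assumes the textbook relation
nr_bits = max_entries * nr_hashes / ln(2), approximates 1/ln(2) as 7/5,
and rounds up to a power of two.  The exact constants and rounding in
the patch may differ, and bloom_nr_bits() and roundup_pow2() are made-up
names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (expects 1 <= v <= 2^63). */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper, not the kernel's code: estimate the bit-array
 * size for a Bloom filter holding max_entries items with nr_hashes
 * hash functions.  Uses n * k / ln(2) with 1/ln(2) taken as 7/5
 * (an assumption), then rounds up to a power of two so that a bitmask
 * can replace the modulo when indexing.
 */
static uint64_t bloom_nr_bits(uint32_t max_entries, uint32_t nr_hashes)
{
	uint64_t bits;

	/* Arbitrary bound on nr_hashes keeps this sketch free of overflow. */
	assert(max_entries > 0 && nr_hashes > 0 && nr_hashes <= 16);

	bits = (uint64_t)max_entries * nr_hashes * 7 / 5;
	return roundup_pow2(bits);
}
```

With this arithmetic, max_entries = 1000 and nr_hashes = 5 give 7000
bits before rounding and 8192 after.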

> > user space need to co-ordinate more and worry about more things,
> > e.g. how to reuse a bloom filter with different nr_hashes,
> > nr_bits, handle synchronization...etc.
> 
> Please see my RFC ([0]). I don't think there is much to coordinate. It
> could be purely BPF-side code, or BPF + user-space initialization
> code, depending on the need. It's a simple and beautiful algorithm,
> which BPF is powerful enough to implement customly and easily.
> 
>   [0] https://lore.kernel.org/bpf/20210922203224.912809-1-andrii@xxxxxxxxxx/T/#t
In practice, the bloom filter will be populated only once by userspace.

Future updates will be done through map-in-map, replacing the whole
bloom filter.  The new one may have more max_entries and more nr_hashes,
or fewer max_entries and fewer nr_hashes.

Currently, the continuously running bpf prog using this bloom filter
does not need to worry about any change in the configuration/setup of
the newer bloom filter.
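
To illustrate the point, here is a rough user-space model of the
replace-the-whole-filter pattern.  It is plain C, not BPF code, and it
is not taken from the patch: struct bloom, hash64(), outer_slot,
lookup() and publish() are all hypothetical names, and the hash is just
a splitmix64-style mixer picked for the example.  The idea it shows is
that each filter carries its own nr_hashes and size, so the reader path
stays the same when a differently configured filter is swapped in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* A filter that describes itself; readers never hard-code its config. */
struct bloom {
	uint32_t nr_hashes;
	uint64_t bit_mask;	/* nr_bits - 1, nr_bits is a power of two */
	uint64_t bits[];
};

/* Stand-in for the single slot of an outer map-in-map. */
static _Atomic(struct bloom *) outer_slot;

/* Illustrative mixer (splitmix64-style); any seeded hash would do. */
static uint64_t hash64(uint64_t x, uint32_t seed)
{
	x += 0x9e3779b97f4a7c15ULL * ((uint64_t)seed + 1);
	x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
	x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
	return x ^ (x >> 31);
}

static struct bloom *bloom_new(uint64_t nr_bits, uint32_t nr_hashes)
{
	uint64_t nr_words = (nr_bits + 63) / 64;
	struct bloom *b;

	assert(nr_hashes > 0 && nr_bits && (nr_bits & (nr_bits - 1)) == 0);
	b = calloc(1, sizeof(*b) + nr_words * sizeof(uint64_t));
	assert(b);
	b->nr_hashes = nr_hashes;
	b->bit_mask = nr_bits - 1;
	return b;
}

static void bloom_add(struct bloom *b, uint64_t key)
{
	for (uint32_t i = 0; i < b->nr_hashes; i++) {
		uint64_t bit = hash64(key, i) & b->bit_mask;

		b->bits[bit / 64] |= 1ULL << (bit % 64);
	}
}

static bool bloom_test(const struct bloom *b, uint64_t key)
{
	for (uint32_t i = 0; i < b->nr_hashes; i++) {
		uint64_t bit = hash64(key, i) & b->bit_mask;

		if (!(b->bits[bit / 64] & (1ULL << (bit % 64))))
			return false;
	}
	return true;
}

/* Reader path: identical no matter how the current filter is sized. */
static bool lookup(uint64_t key)
{
	struct bloom *b = atomic_load(&outer_slot);

	return b && bloom_test(b, key);
}

/*
 * Writer path: publish a fully populated filter and return the old one.
 * The caller may free the old filter only once no reader can still be
 * using it; this model leaves that synchronization out.
 */
static struct bloom *publish(struct bloom *newb)
{
	return atomic_exchange(&outer_slot, newb);
}
```

In this model the writer builds and fills a new filter off to the side,
then calls publish(); lookup() callers pick up the new nr_hashes and
size on their next call without any change on their side.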

I wonder what that would look like with the custom bpf bloom filter in
the bench prog for the map-in-map use case.

> 
> >
> > It is useful to have a default implementation in the kernel
> > for some useful maps like this one that works for most
> > common cases and the bpf user can just use it as get-and-go
> > like all other common bpf maps do.
> 
> I disagree with the premise that Bloom filter is a common and
> generally useful data structure, tbh. It has its nice niche
> applications, but its semantics isn't applicable generally, which is
> why I hesitate to claim that this should live in kernel.
I don't agree that the application is niche.  I have encountered it
many times in discussions of networking use cases, and it is not
necessarily limited to security usage either.  Yes, it is not a
linked-list-like data structure, but its usage is very common.