On Wed, Sep 22, 2021 at 01:52:12PM -0700, Andrii Nakryiko wrote:
> > Agree that a generic hash helper is in general useful.  It may be
> > useful in hashing the skb also.  The bpf prog only implementation could
> > have more flexibility in configuring roundup to pow2 or not, how to hash,
> > how many hashes, nr of bits ...etc.  In the mean time, the bpf prog and
>
> Exactly. If I know better how many bits I need, I'll have to reverse
> engineer kernel's heuristic to provide such max_entries values to
> arrive at the desired amount of memory that Bloom filter will be
> using.
Good point.  I don't think it needs to guess, though.  The formula is
stable and publicly known.  The formula comment in
kernel/bpf/bloom_filter.c should be moved to include/uapi/linux/bpf.h.

> > user space need to co-ordinate more and worry about more things,
> > e.g. how to reuse a bloom filter with different nr_hashes,
> > nr_bits, handle synchronization...etc.
>
> Please see my RFC ([0]). I don't think there is much to coordinate. It
> could be purely BPF-side code, or BPF + user-space initialization
> code, depending on the need. It's a simple and beautiful algorithm,
> which BPF is powerful enough to implement customly and easily.
>
>   [0] https://lore.kernel.org/bpf/20210922203224.912809-1-andrii@xxxxxxxxxx/T/#t
In practice, the bloom filter will be populated only once, by
userspace.  Future updates will be done through map-in-map, replacing
the whole bloom filter.  The new one may have more max_entries and more
nr_hashes, or fewer max_entries and fewer nr_hashes.  Currently, the
continuously running bpf prog that uses this bloom filter does not need
to worry about any change in the configuration/setup of the newer
bloom filter.  I wonder what that would look like with the custom bpf
bloom filter in the bench prog for the map-in-map usage.
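For reference, here is a minimal userspace sketch of the kind of sizing rule being discussed.  It uses the textbook optimum of m = n * k / ln(2) bits for n entries and k hash functions.  The 7/5 approximation of 1/ln(2), the power-of-two rounding, and the names roundup_pow2() and bloom_nr_bits() are assumptions made for illustration; the authoritative formula is whatever the comment in kernel/bpf/bloom_filter.c states.

```c
#include <stdint.h>

/* Round v up to the next power of two; returns 1 for v <= 1. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Textbook bloom filter sizing: m = n * k / ln(2) bits, where n is the
 * expected number of entries and k is the number of hash functions.
 *
 * ASSUMPTION: 1/ln(2) (about 1.4427) is approximated as 7/5 and the
 * result is rounded up to a power of two.  This is an illustrative
 * guess at the heuristic, not a copy of the kernel code.  Overflow for
 * very large inputs is not handled here.
 */
static uint64_t bloom_nr_bits(uint64_t max_entries, uint32_t nr_hashes)
{
	uint64_t bits = max_entries * nr_hashes * 7 / 5;

	return roundup_pow2(bits ? bits : 1);
}
```

Under these assumptions, max_entries = 1000 with nr_hashes = 5 gives 7000 bits before rounding and 8192 bits (1 KiB) after.  Going the other way, a user who wants a given memory budget can solve the same expression for max_entries.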

> > It is useful to have a default implementation in the kernel
> > for some useful maps like this one that works for most
> > common cases and the bpf user can just use it as get-and-go
> > like all other common bpf maps do.
>
> I disagree with the premise that Bloom filter is a common and
> generally useful data structure, tbh. It has its nice niche
> applications, but its semantics isn't applicable generally, which is
> why I hesitate to claim that this should live in kernel.
I don't agree that the application is niche.  I have run into it many
times in discussions of networking use cases, and it is not
necessarily limited to security usage either.  Yes, it is not as
basic a data structure as a linked list, but its usage is very common.