On Wed, Oct 27, 2021 at 04:44:59PM -0700, Joanne Koong wrote:
> This patchset adds a new kind of bpf map: the bloom filter map.
> Bloom filters are a space-efficient probabilistic data structure
> used to quickly test whether an element exists in a set.
> For a brief overview about how bloom filters work,
> https://en.wikipedia.org/wiki/Bloom_filter
> may be helpful.
>
> One example use-case is an application leveraging a bloom filter
> map to determine whether a computationally expensive hashmap
> lookup can be avoided. If the element was not found in the bloom
> filter map, the hashmap lookup can be skipped.
>
> This patchset includes benchmarks for testing the performance of
> the bloom filter for different entry sizes and different number of
> hash functions used, as well as comparisons for hashmap lookups
> with vs. without the bloom filter.
>
> A high level overview of this patchset is as follows:
> 1/5 - kernel changes for adding bloom filter map
> 2/5 - libbpf changes for adding map_extra flags
> 3/5 - tests for the bloom filter map
> 4/5 - benchmarks for bloom filter lookup/update throughput and false positive
>       rate
> 5/5 - benchmarks for how hashmap lookups perform with vs. without the bloom
>       filter
>
> v5 -> v6:
> * in 1/5: remove "inline" from the hash function, add check in syscall to
>   fail out in cases where map_extra is not 0 for non-bloom-filter maps,
>   fix alignment matching issues, move "map_extra flags" comments to inside
>   the bpf_attr struct, add bpf_map_info map_extra changes here, add map_extra
>   assignment in bpf_map_get_info_by_fd, change hash value_size to u32 instead of
>   a u64
> * in 2/5: remove bpf_map_info map_extra changes, remove TODO comment about
>   extending BTF arrays to cover u64s, cast to unsigned long long for %llx when
>   printing out map_extra flags
> * in 3/5: use __type(value, ...) instead of __uint(value_size, ...) for values
>   and keys
> * in 4/5: fix wrong bounds for the index when iterating through random values,
>   update commit message to include update+lookup benchmark results for 8 byte
>   and 64-byte value sizes, remove explicit global bool initialization to false
>   for hashmap_use_bloom and count_false_hits variables

Thanks! Only have minor comments in patch 1.

Belated Acked-by: Martin KaFai Lau <kafai@xxxxxx>
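
For reference, the use case from the cover letter (skip the expensive lookup when the bloom filter says the element is absent) can be sketched in plain userspace C. This is only an illustration under stated assumptions: it does not use the bpf map API or the patchset's hash function. The names struct bloom, bloom_add(), bloom_may_contain(), expensive_lookup() and lookup_with_bloom(), the seeded FNV-1a hash, and the sizes are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only; not values taken from the patchset. */
#define BLOOM_BITS      1024u   /* number of bits in the filter */
#define BLOOM_NR_HASHES 3u      /* hash functions applied per element */

struct bloom {
	uint64_t bits[BLOOM_BITS / 64];
};

/* Seeded FNV-1a, used here as a stand-in hash (an assumption). */
static uint32_t bloom_hash(const void *data, size_t len, uint32_t seed)
{
	const unsigned char *p = data;
	uint32_t h = 2166136261u ^ seed;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h % BLOOM_BITS;
}

static void bloom_init(struct bloom *b)
{
	memset(b, 0, sizeof(*b));
}

/* Set one bit per hash function for this element. */
static void bloom_add(struct bloom *b, const void *data, size_t len)
{
	assert(b && data);
	for (uint32_t i = 0; i < BLOOM_NR_HASHES; i++) {
		uint32_t bit = bloom_hash(data, len, i * 0x9e3779b9u);

		b->bits[bit / 64] |= 1ULL << (bit % 64);
	}
}

/* false: definitely not added; true: possibly added (false positives happen). */
static bool bloom_may_contain(const struct bloom *b, const void *data, size_t len)
{
	assert(b && data);
	for (uint32_t i = 0; i < BLOOM_NR_HASHES; i++) {
		uint32_t bit = bloom_hash(data, len, i * 0x9e3779b9u);

		if (!(b->bits[bit / 64] & (1ULL << (bit % 64))))
			return false;
	}
	return true;
}

/* Counts calls so the effect of the filter can be observed. */
static unsigned long expensive_lookups;

/* Stand-in for the costly hashmap lookup: a linear scan over the set. */
static bool expensive_lookup(const uint32_t *set, size_t n, uint32_t key)
{
	expensive_lookups++;
	for (size_t i = 0; i < n; i++)
		if (set[i] == key)
			return true;
	return false;
}

/* Consult the filter first; only fall through to the costly lookup on a hit. */
static bool lookup_with_bloom(const struct bloom *b, const uint32_t *set,
			      size_t n, uint32_t key)
{
	if (!bloom_may_contain(b, &key, sizeof(key)))
		return false;
	return expensive_lookup(set, n, key);
}
```

A caller would bloom_init() the filter, bloom_add() every key that is also stored in the real table, and then route lookups through lookup_with_bloom(). A filter miss is always correct, so the only cost of a false positive is one lookup that could have been skipped.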