On Wed, Oct 6, 2021 at 3:27 PM Joanne Koong <joannekoong@xxxxxx> wrote: > > This patch adds the kernel-side changes for the implementation of > a bitset map with bloom filter capabilities. > > The bitset map does not have keys, only values since it is a > non-associative data type. When the bitset map is created, it must > be created with a key_size of 0, and the max_entries value should be the > desired size of the bitset, in number of bits. > > The bitset map supports peek (determining whether a bit is set in the > map), push (setting a bit in the map), and pop (clearing a bit in the > map) operations. These operations are exposed to userspace applications > through the already existing syscalls in the following way: > > BPF_MAP_UPDATE_ELEM -> bpf_map_push_elem > BPF_MAP_LOOKUP_ELEM -> bpf_map_peek_elem > BPF_MAP_LOOKUP_AND_DELETE_ELEM -> bpf_map_pop_elem > > For updates, the user will pass in a NULL key and the index of the > bit to set in the bitmap as the value. For lookups, the user will pass > in the index of the bit to check as the value. If the bit is set, 0 > will be returned, else -ENOENT. For clearing the bit, the user will pass > in the index of the bit to clear as the value. > > Since we need to pass in the index of the bit to set/clear in the bitmap While I can buy that Bloom filter doesn't have a key, bitset clearly has and that key is bit index. Are we all sure that this marriage of bit set and bloom filter is the best API design? > whenever we do a lookup/clear, in the verifier layer this requires us to > modify the argument type of a bitset's BPF_FUNC_map_peek_elem and > BPF_FUNC_map_pop_elem calls to ARG_PTR_TO_MAP_VALUE; correspondingly, in > the syscall layer, we need to copy over the user value so that in > bpf_map_peek_elem and bpf_map_pop_elem, we have the specific > value to check. > > The bitset map may be used as an inner map. > > The bitset map may also have additional bloom filter capabilities. The > lower 4 bits of the map_extra flags for the bitset map denote how many > hash functions to use. If more than 0 hash functions is specified, the > bitset map will operate as a bloom filter where a set of bits are > set/checked where the bits are the results from the bloom filter > functions. Right now, jhash is function used; in the future, this can be > expanded to use map_extra to specify which hash function to use. Please make sure that map_extra is forced to zero for all the existing maps, for future extensibility. > > A few things to additionally please take note of: > * If there are any concurrent lookups + updates in the bloom filter, the > user is responsible for synchronizing this to ensure no false negative > lookups occur. > * Deleting an element in the bloom filter map is not supported. > * The benchmarks later in this patchset can help compare the performance > of using different number of hashes on different entry sizes. In general, > using more hashes increases the false positive rate of an element being probably meant "decreases" here? > detected in the bloom filter but decreases the speed of a lookup. > > Signed-off-by: Joanne Koong <joannekoong@xxxxxx> > --- > include/linux/bpf.h | 2 + > include/linux/bpf_types.h | 1 + > include/uapi/linux/bpf.h | 9 ++ > kernel/bpf/Makefile | 2 +- > kernel/bpf/bitset.c | 256 +++++++++++++++++++++++++++++++++ > kernel/bpf/syscall.c | 25 +++- > kernel/bpf/verifier.c | 10 +- > tools/include/uapi/linux/bpf.h | 9 ++ > 8 files changed, 308 insertions(+), 6 deletions(-) > create mode 100644 kernel/bpf/bitset.c > [...]