On Fri, Oct 8, 2021 at 4:05 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Wed, Oct 6, 2021 at 3:27 PM Joanne Koong <joannekoong@xxxxxx> wrote: > > > > This patch adds the kernel-side changes for the implementation of > > a bitset map with bloom filter capabilities. > > > > The bitset map does not have keys, only values since it is a > > non-associative data type. When the bitset map is created, it must > > be created with a key_size of 0, and the max_entries value should be the > > desired size of the bitset, in number of bits. > > > > The bitset map supports peek (determining whether a bit is set in the > > map), push (setting a bit in the map), and pop (clearing a bit in the > > map) operations. These operations are exposed to userspace applications > > through the already existing syscalls in the following way: > > > > BPF_MAP_UPDATE_ELEM -> bpf_map_push_elem > > BPF_MAP_LOOKUP_ELEM -> bpf_map_peek_elem > > BPF_MAP_LOOKUP_AND_DELETE_ELEM -> bpf_map_pop_elem > > > > For updates, the user will pass in a NULL key and the index of the > > bit to set in the bitmap as the value. For lookups, the user will pass > > in the index of the bit to check as the value. If the bit is set, 0 > > will be returned, else -ENOENT. For clearing the bit, the user will pass > > in the index of the bit to clear as the value. > > > > Since we need to pass in the index of the bit to set/clear in the bitmap > > While I can buy that Bloom filter doesn't have a key, bitset clearly > has and that key is bit index. Are we all sure that this marriage of > bit set and bloom filter is the best API design? > Adding on to this, as a user of the bpf helpers, peek, push, and pop have meaning to me as operating on data structures for which those are well defined, such as stacks and queues. For bitsets, "push"ing a bit doesn't make sense to me as a concept, so needing to use bpf_map_push_elem to set a bit is not intuitive to me. bpf_map_lookup_elem, bpf_map_update_elem, and bpf_map_delete_elem make more sense to me (though if we have update, delete is redundant). > > whenever we do a lookup/clear, in the verifier layer this requires us to > > modify the argument type of a bitset's BPF_FUNC_map_peek_elem and > > BPF_FUNC_map_pop_elem calls to ARG_PTR_TO_MAP_VALUE; correspondingly, in > > the syscall layer, we need to copy over the user value so that in > > bpf_map_peek_elem and bpf_map_pop_elem, we have the specific > > value to check. > > > > The bitset map may be used as an inner map. > > > > The bitset map may also have additional bloom filter capabilities. The > > lower 4 bits of the map_extra flags for the bitset map denote how many > > hash functions to use. If more than 0 hash functions is specified, the > > bitset map will operate as a bloom filter where a set of bits are > > set/checked where the bits are the results from the bloom filter > > functions. Right now, jhash is function used; in the future, this can be > > expanded to use map_extra to specify which hash function to use. > > Please make sure that map_extra is forced to zero for all the existing > maps, for future extensibility. > > > > > A few things to additionally please take note of: > > * If there are any concurrent lookups + updates in the bloom filter, the > > user is responsible for synchronizing this to ensure no false negative > > lookups occur. > > * Deleting an element in the bloom filter map is not supported. > > * The benchmarks later in this patchset can help compare the performance > > of using different number of hashes on different entry sizes. In general, > > using more hashes increases the false positive rate of an element being > > probably meant "decreases" here? > > > detected in the bloom filter but decreases the speed of a lookup. > > > > Signed-off-by: Joanne Koong <joannekoong@xxxxxx> > > --- > > include/linux/bpf.h | 2 + > > include/linux/bpf_types.h | 1 + > > include/uapi/linux/bpf.h | 9 ++ > > kernel/bpf/Makefile | 2 +- > > kernel/bpf/bitset.c | 256 +++++++++++++++++++++++++++++++++ > > kernel/bpf/syscall.c | 25 +++- > > kernel/bpf/verifier.c | 10 +- > > tools/include/uapi/linux/bpf.h | 9 ++ > > 8 files changed, 308 insertions(+), 6 deletions(-) > > create mode 100644 kernel/bpf/bitset.c > > > > [...]