On Fri, Aug 30, 2019 at 06:39:48AM +0000, Yonghong Song wrote:
> >
> > The problem happens when you are trying to do batch lookup on a
> > hashmap and when executing bpf_map_get_next_key(map, key, next_key)
> > the key is removed, then that call will return the first key and you'd
> > start iterating the map from the beginning again and retrieve
> > duplicate information.
>
> Right. Maybe we can have another bpf_map_ops callback function
> like 'map_batch_get_next_key' which won't fall back to the
> first key if the 'key' is not available in the hash table?

The reason I picked this get_next_key behavior long ago is that I
couldn't come up with a way to pick the next key consistently.
In a hash table the elements are not sorted. If there is more than
one element in a hash table bucket, they are added to the link list
in sort-of random order. If one out of N elems in the bucket is
deleted, which one should be picked next?
select_bucket() picks the bucket. If lookup_nulls_elem_raw() cannot
find the element, which one in the link list is the "right one" to
continue from?
Iterating over a hash table without duplicates while elements are
being added and removed in parallel is a hard problem to solve.
I think "best effort" is the right answer.
When users care about consistency they should use map-in-map.
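
For illustration, here is a minimal user-space sketch of the restart
hazard being discussed. It assumes a hypothetical fd `map_fd` for a
BPF_MAP_TYPE_HASH with u32 keys and u64 values, and uses the libbpf
syscall wrappers; it is not part of any proposed patch.

#include <errno.h>
#include <stdio.h>
#include <linux/types.h>
#include <bpf/bpf.h>

static void walk_map(int map_fd)
{
	__u32 key, next_key;
	__u64 value;
	int err;

	/* Passing NULL asks the kernel for the first key in the map. */
	err = bpf_map_get_next_key(map_fd, NULL, &next_key);
	while (!err) {
		key = next_key;
		if (!bpf_map_lookup_elem(map_fd, &key, &value))
			printf("key %u -> value %llu\n", key,
			       (unsigned long long)value);

		/* If another thread deletes 'key' before this call,
		 * lookup_nulls_elem_raw() cannot find it in the bucket's
		 * link list and the kernel falls back to returning the
		 * first key, so this loop silently restarts from the
		 * beginning and reports duplicates.
		 */
		err = bpf_map_get_next_key(map_fd, &key, &next_key);
	}
	if (errno != ENOENT)	/* ENOENT means end of map was reached. */
		perror("bpf_map_get_next_key");
}

The loop itself is the standard iteration pattern; the "best effort"
semantics only show up when deletions race with it.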