On Tue, Sep 03, 2019 at 04:07:17PM -0700, Brian Vazquez wrote:
>
> We could also modify get_next_key behaviour _only_ when it's called
> from a dumping function, in which case we do know that we want to move
> forward, not backwards (basically, if prev_key is not found, then
> retrieve the first key in the next bucket).
>
> That approach might miss entries that are in the same bucket where the
> prev_key used to be, but missing entries is something that can always
> happen (new additions in previous buckets); we cannot control that,
> and as has been said before, if users care about consistency they can
> use map-in-map.

For the dump-all case such a miss of elements might be ok-ish,
but for delete-all it's probably not.
Imagine a bpf prog is doing deletes while a bcc script is doing
delete-all (see the sketches at the end of this mail).
With the 'go to the next bucket' logic there will be cases where
both the kernel and the user side are deleting, yet some elements
are still left behind.
True, map-in-map is the answer, but if we're adding a new get_next op
we should probably do it with fewer corner cases.

> > This all requires new per-map implementations, unfortunately :-(
> > We were trying to see if we can somehow improve the existing bpf_map_ops
> > to be more friendly towards batching.
>
> Agreed that one of the motivations for the current batching implementation
> was to reuse the existing code as much as we can. Although this might
> be a good opportunity to do some per-map implementations, as long as
> they behave better than the current ones.

I don't think non-reuse of get_next is a big deal.
Adding another get_next_v2 to all maps is a low cost
compared with the advantages of having a stable iterate api.
A stable walk over a hash map is imo a more important feature of the
api than the perf improvement brought by batching.
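
For reference, the user-space delete-all walk in question is roughly the
pattern below. This is only a sketch built on libbpf's
bpf_map_get_next_key() and bpf_map_delete_elem(); the map fd and the u32
key type are made up for the example.

#include <bpf/bpf.h>
#include <linux/types.h>

/* delete-all over a hash map fd, as a bcc-style tool does it today */
static void delete_all(int map_fd)
{
	__u32 key, next_key;
	void *prev = NULL;

	while (bpf_map_get_next_key(map_fd, prev, &next_key) == 0) {
		/* delete the key we are standing on, then step from it */
		bpf_map_delete_elem(map_fd, &next_key);
		key = next_key;
		prev = &key;
	}
}

Today, when 'prev' has already disappeared (deleted by us or by a bpf
prog), get_next_key falls back to the first element, so this loop
eventually deletes everything. With 'first key in the next bucket'
semantics the loop can instead skip keys still sitting in prev's bucket
and finish with live elements left in the map.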
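
And a completely made-up userspace model of the 'resume from the next
bucket' idea, just to show the corner case. None of the names or types
below correspond to the kernel's hashtab code; it's a toy.

#include <stdio.h>
#include <string.h>

#define NBUCKETS 4
#define BUCKET_LEN 8

struct bucket {
	int keys[BUCKET_LEN];
	int n;
};

static struct bucket table[NBUCKETS];

static unsigned int hash(int key)
{
	return (unsigned int)key % NBUCKETS;	/* toy hash */
}

static void add(int key)
{
	struct bucket *b = &table[hash(key)];

	b->keys[b->n++] = key;
}

/* proposed semantics: if prev_key is gone, resume from the *next*
 * bucket instead of restarting; whatever is left in prev_key's old
 * bucket is skipped
 */
static int get_next(const int *prev_key, int *next_key)
{
	unsigned int i = 0;
	int j;

	if (prev_key) {
		struct bucket *b = &table[hash(*prev_key)];

		for (j = 0; j < b->n; j++) {
			if (b->keys[j] == *prev_key) {
				if (j + 1 < b->n) {
					*next_key = b->keys[j + 1];
					return 0;
				}
				break;
			}
		}
		/* prev_key missing or last in its bucket: go forward */
		i = hash(*prev_key) + 1;
	}

	for (; i < NBUCKETS; i++) {
		if (table[i].n) {
			*next_key = table[i].keys[0];
			return 0;
		}
	}
	return -1;	/* no more elements */
}

int main(void)
{
	int prev = 5, next;

	add(1); add(5); add(9);		/* all three land in bucket 1 */

	/* a concurrent delete removes key 5 while the walk stands on it */
	memmove(&table[1].keys[1], &table[1].keys[2], sizeof(int));
	table[1].n = 2;			/* bucket 1 is now [1, 9] */

	/* prev (5) is gone, so the walk resumes at bucket 2 and key 9,
	 * still live in bucket 1, is never visited (never deleted, in
	 * the delete-all case); this loop prints nothing
	 */
	while (get_next(&prev, &next) == 0) {
		printf("visited %d\n", next);
		prev = next;
	}
	return 0;
}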