On Sat, Aug 1, 2020 at 9:59 AM Yonghong Song <yhs@xxxxxx> wrote:
>
>
>
> On 7/31/20 9:57 PM, Brian Vazquez wrote:
> > While running some experiments it was observed that map_lookup_batch was much
> > slower than get_next_key + lookup when the syscall overhead is minimal.
> > This was because the map_lookup_batch implementation was more expensive when
> > traversing empty buckets, which can be really costly when the pre-allocated
> > map is too big.
> >
> > This patch optimizes the case when the bucket is empty so we can move quickly
> > to the next bucket.
> >
> > The benchmark to exercise this is as follows:
> >
> >  -The map was populated with a single entry to make sure that the syscall
> >   overhead is not helping map_lookup_batch.
> >  -The size of the preallocated map was increased to show the effect of
> >   traversing empty buckets.
> >
> > Results:
> >
> >   Using get_next_key + lookup:
> >
> >   Benchmark                Time(ns)        CPU(ns)     Iteration
> >   ---------------------------------------------------------------
> >   BM_DumpHashMap/1/1k          3593           3586        192680
> >   BM_DumpHashMap/1/4k          6004           5972        100000
> >   BM_DumpHashMap/1/16k        15755          15710         44341
> >   BM_DumpHashMap/1/64k        59525          59376         10000
>
> I think "BM_DumpHashMap/1/64k" means the program "BM_DumpHashMap",
> the map having only "1" entry, and the map preallocated size is "64k"?
> What is the "Iteration" here? The number of runs with the same dump?
> The CPU(ns) is the system cpu consumption, right? The Time/CPU is for
> all iterations, not just one, right? It would be good
> if the above results can be described better, so people can
> understand the results better.
>

Hi Yonghong, thanks for reviewing it! I'll fix it in the next iteration.

> >
> >   Using htab_lookup_batch before this patch:
> >
> >   Benchmark                Time(ns)        CPU(ns)    Iterations
> >   ---------------------------------------------------------------
> >   BM_DumpHashMap/1/1k          3933           3927        177978
> >   BM_DumpHashMap/1/4k          9192           9177         73951
> >   BM_DumpHashMap/1/16k        42011          41970         16789
> >   BM_DumpHashMap/1/64k       117895         117661          6135
> >
> >   Using htab_lookup_batch with this patch:
> >
> >   Benchmark                Time(ns)        CPU(ns)    Iterations
> >   ---------------------------------------------------------------
> >   BM_DumpHashMap/1/1k          2809           2803        249212
> >   BM_DumpHashMap/1/4k          5318           5316        100000
> >   BM_DumpHashMap/1/16k        14925          14895         47448
> >   BM_DumpHashMap/1/64k        58870          58674         10000
> >
> > Suggested-by: Luigi Rizzo <lrizzo@xxxxxxxxxx>
> > Cc: Yonghong Song <yhs@xxxxxx>
> > Signed-off-by: Brian Vazquez <brianvv@xxxxxxxxxx>
> > ---
> >  kernel/bpf/hashtab.c | 23 ++++++++---------------
> >  1 file changed, 8 insertions(+), 15 deletions(-)
> >
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 2137e2200d95..150015ea6737 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -1351,7 +1351,6 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >  	struct hlist_nulls_head *head;
> >  	struct hlist_nulls_node *n;
> >  	unsigned long flags = 0;
> > -	bool locked = false;
> >  	struct htab_elem *l;
> >  	struct bucket *b;
> >  	int ret = 0;
> > @@ -1410,19 +1409,19 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >  	dst_val = values;
> >  	b = &htab->buckets[batch];
> >  	head = &b->head;
> > -	/* do not grab the lock unless need it (bucket_cnt > 0).
> > -	 */
> > -	if (locked)
> > -		flags = htab_lock_bucket(htab, b);
> >
> > +	l = hlist_nulls_entry_safe(rcu_dereference_raw(hlist_nulls_first_rcu(head)),
> > +				   struct htab_elem, hash_node);
> > +	if (!l && (batch + 1 < htab->n_buckets)) {
> > +		batch++;
> > +		goto again_nocopy;
> > +	}
> > +
> > +	flags = htab_lock_bucket(htab, b);
> [...]
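
For reference, the two dump strategies being compared look roughly like the
sketch below from userspace. This is not the benchmark code itself: map_fd,
BATCH_SZ and the u32-key/long-value layout are assumptions made purely for
illustration, and it only relies on the libbpf wrappers bpf_map_get_next_key(),
bpf_map_lookup_elem() and bpf_map_lookup_batch().

/* Minimal sketch of the two dump strategies (illustrative only; map_fd,
 * BATCH_SZ and the u32-key/long-value layout are assumptions).
 */
#include <errno.h>
#include <bpf/bpf.h>
#include <linux/types.h>

#define BATCH_SZ 256

/* Strategy 1: get_next_key + lookup, one pair of syscalls per element.
 * Cheap when the map holds very few entries.
 */
static void dump_with_get_next_key(int map_fd)
{
	__u32 key, next_key;
	long value;
	int err;

	err = bpf_map_get_next_key(map_fd, NULL, &key);
	while (!err) {
		bpf_map_lookup_elem(map_fd, &key, &value);
		err = bpf_map_get_next_key(map_fd, &key, &next_key);
		key = next_key;
	}
}

/* Strategy 2: BPF_MAP_LOOKUP_BATCH. Few syscalls, but each call makes the
 * kernel walk buckets of the preallocated table, which is what this patch
 * makes cheaper when most buckets are empty.
 */
static void dump_with_lookup_batch(int map_fd)
{
	__u32 in_batch, out_batch, count;
	__u32 keys[BATCH_SZ];
	long values[BATCH_SZ];
	void *in = NULL;
	int err;

	do {
		count = BATCH_SZ;
		err = bpf_map_lookup_batch(map_fd, in, &out_batch,
					   keys, values, &count, NULL);
		if (err && errno != ENOENT)
			break;	/* real error */
		/* process the "count" entries returned in keys[]/values[] */
		in_batch = out_batch;
		in = &in_batch;
	} while (!err);
	/* ENOENT here simply means the whole table has been traversed */
}

The batch path amortizes the syscall overhead, which is why the per-bucket
walk inside the kernel starts to dominate once the preallocated table is much
larger than the number of populated entries.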