Brian Vazquez has proposed a BPF_MAP_DUMP command to look up more than
one map entry per syscall.
  https://lore.kernel.org/bpf/CABCgpaU3xxX6CMMxD+1knApivtc2jLBHysDXw-0E9bQEL0qC3A@xxxxxxxxxxxxxx/T/#t

During discussion, we found that more use cases can be supported by a
similar map operation batching framework. For example, batched map
lookup and delete, which can be really helpful for bcc.
  https://github.com/iovisor/bcc/blob/master/tools/tcptop.py#L233-L243
  https://github.com/iovisor/bcc/blob/master/tools/slabratetop.py#L129-L138

Also, in bcc, we have an API to delete all entries in a map.
  https://github.com/iovisor/bcc/blob/master/src/cc/api/BPFTable.h#L257-L264

For map update, batched operations are also useful, as applications
sometimes need to populate initial maps with more than one entry.
For example, the code below is from
kernel/samples/bpf/xdp_redirect_cpu_user.c:
  https://github.com/torvalds/linux/blob/master/samples/bpf/xdp_redirect_cpu_user.c#L543-L550

This patch set addresses all of the above use cases. To keep the uapi
stable, it also covers other potential use cases. Four bpf syscall
subcommands are introduced:
    BPF_MAP_LOOKUP_BATCH
    BPF_MAP_LOOKUP_AND_DELETE_BATCH
    BPF_MAP_UPDATE_BATCH
    BPF_MAP_DELETE_BATCH

In userspace, an application can iterate through the whole map one
batch at a time, e.g., with bpf_map_lookup_batch() as below:
    p_key = NULL;
    p_next_key = &key;
    while (true) {
       err = bpf_map_lookup_batch(fd, p_key, &p_next_key, keys, values,
                                  &batch_size, elem_flags, flags);
       if (err) ...
       if (!p_next_key) break; // done
       if (!p_key) p_key = p_next_key;
    }

Please see the individual patches for details of the new syscall
subcommands and examples of user code.

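To make the cursor handling in the loop above concrete, here is a
minimal, self-contained sketch. It does not call the kernel or libbpf:
struct mock_map, mock_lookup_batch() and walk_map() are hypothetical
stand-ins invented for illustration, and the assumed semantics (a NULL
start key means "start from the beginning", a NULL next key on return
means "no more entries") are my reading of the loop, not a statement
of the final uapi.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH 64

/* Hypothetical in-memory stand-in for a map: keys 0..n-1, value = 2 * key. */
struct mock_map {
	unsigned int n;
};

/*
 * Hypothetical stand-in for the proposed batched lookup; NOT the libbpf API.
 * start_key == NULL starts at the first entry, otherwise the walk resumes
 * at *start_key. On entry *count is the capacity of keys[]/values[]; on
 * return it is the number of entries copied. If entries remain, the resume
 * key is stored in **next_key; if the map is exhausted, *next_key is NULL.
 */
static int mock_lookup_batch(const struct mock_map *map,
			     const unsigned int *start_key,
			     unsigned int **next_key, unsigned int *keys,
			     unsigned long *values, unsigned int *count)
{
	unsigned int pos, copied = 0;

	if (!map || !next_key || !*next_key || !keys || !values ||
	    !count || *count == 0)
		return -1;

	/* Read the start key first: it may share storage with **next_key. */
	pos = start_key ? *start_key : 0;

	while (copied < *count && pos < map->n) {
		keys[copied] = pos;
		values[copied] = 2UL * pos;
		copied++;
		pos++;
	}
	*count = copied;

	if (pos < map->n)
		**next_key = pos;	/* more entries: tell caller where to resume */
	else
		*next_key = NULL;	/* done */
	return 0;
}

/*
 * Walk the whole mock map one batch at a time, mirroring the loop in the
 * cover letter. Returns the number of entries visited (or -1 on error) and
 * stores the sum of all values in *value_sum.
 */
long walk_map(const struct mock_map *map, unsigned int batch_size,
	      unsigned long *value_sum)
{
	unsigned int keys[MAX_BATCH];
	unsigned long values[MAX_BATCH];
	unsigned int key, *p_key = NULL, *p_next_key = &key;
	long total = 0;

	if (!value_sum || batch_size == 0 || batch_size > MAX_BATCH)
		return -1;
	*value_sum = 0;

	while (1) {
		/* The count is in/out, so reset it to the capacity each time. */
		unsigned int i, count = batch_size;

		if (mock_lookup_batch(map, p_key, &p_next_key, keys, values,
				      &count))
			return -1;
		assert(count <= batch_size);

		for (i = 0; i < count; i++)
			*value_sum += values[i];
		total += count;

		if (!p_next_key)
			break;			/* done */
		if (!p_key)
			p_key = p_next_key;	/* later calls resume at key */
	}
	return total;
}
```

After the first call, p_key and p_next_key point at the same storage,
so the mock reads the start key before it writes the resume key. The
in/out count is reset on every iteration so that a short batch does not
shrink later ones.
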
Testing was also done in a qemu VM environment:
    measure_lookup: max_entries 1000000, batch 10, time 342ms
    measure_lookup: max_entries 1000000, batch 1000, time 295ms
    measure_lookup: max_entries 1000000, batch 1000000, time 270ms
    measure_lookup: max_entries 1000000, no batching, time 1346ms
    measure_lookup_delete: max_entries 1000000, batch 10, time 433ms
    measure_lookup_delete: max_entries 1000000, batch 1000, time 363ms
    measure_lookup_delete: max_entries 1000000, batch 1000000, time 357ms
    measure_lookup_delete: max_entries 1000000, not batch, time 1894ms
    measure_delete: max_entries 1000000, batch, time 220ms
    measure_delete: max_entries 1000000, not batch, time 1289ms

For a 1M entry hash table, a batch size of 10 can reduce cpu time by
more than 70%. Please see patch "tools/bpf: measure map batching perf"
for details of the test code.

Brian Vazquez (1):
  bpf: add bpf_map_value_size and bp_map_copy_value helper functions

Yonghong Song (12):
  bpf: refactor map_update_elem()
  bpf: refactor map_delete_elem()
  bpf: refactor map_get_next_key()
  bpf: adding map batch processing support
  tools/bpf: sync uapi header bpf.h
  tools/bpf: implement libbpf API functions for map batch operations
  tools/bpf: add test for bpf_map_update_batch()
  tools/bpf: add test for bpf_map_lookup_batch()
  tools/bpf: add test for bpf_map_lookup_and_delete_batch()
  tools/bpf: add test for bpf_map_delete_batch()
  tools/bpf: add a multithreaded test for map batch operations
  tools/bpf: measure map batching perf

 include/uapi/linux/bpf.h                     |  27 +
 kernel/bpf/syscall.c                         | 752 ++++++++++++++----
 tools/include/uapi/linux/bpf.h               |  27 +
 tools/lib/bpf/bpf.c                          |  67 ++
 tools/lib/bpf/bpf.h                          |  17 +
 tools/lib/bpf/libbpf.map                     |   4 +
 .../selftests/bpf/map_tests/map_batch_mt.c   | 126 +++
 .../selftests/bpf/map_tests/map_batch_perf.c | 242 ++++++
 .../bpf/map_tests/map_delete_batch.c         | 139 ++++
 .../map_tests/map_lookup_and_delete_batch.c  | 164 ++++
 .../bpf/map_tests/map_lookup_batch.c         | 166 ++++
 .../bpf/map_tests/map_update_batch.c         | 115 +++
 12 files changed, 1707 insertions(+), 139 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_batch_mt.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_batch_perf.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_delete_batch.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_and_delete_batch.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_lookup_batch.c
 create mode 100644 tools/testing/selftests/bpf/map_tests/map_update_batch.c

-- 
2.17.1