On Wed, May 31, 2023 at 11:05:10AM +0000, Anton Protopopov wrote:
>  static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
>  {
>  	htab_put_fd_value(htab, l);
>
> +	dec_elem_count(htab);
> +
>  	if (htab_is_prealloc(htab)) {
>  		check_and_free_fields(htab, l);
>  		__pcpu_freelist_push(&htab->freelist, &l->fnode);
>  	} else {
> -		dec_elem_count(htab);
>  		htab_elem_free(htab, l);
>  	}
>  }
> @@ -1006,6 +1024,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>  		if (!l)
>  			return ERR_PTR(-E2BIG);
>  		l_new = container_of(l, struct htab_elem, fnode);
> +		inc_elem_count(htab);

The current use_percpu_counter heuristic is far from perfect. It works for some cases, but will surely behave badly in others, as the comment next to PERCPU_COUNTER_BATCH is trying to say. Hence there is a big performance risk in doing inc/dec everywhere, and this is a Nack: we cannot decrease the performance of various maps for the few folks who want to see map stats.

If you want to see "pressure", please switch cilium to the bpf_mem_alloc htab and use tracing-style direct 'struct bpf_htab' access, like progs/map_ptr_kern.c is demonstrating. No kernel patches needed. Then bpf_prog_run such a tracing prog and read all the internal map info. It's less convenient than exposing things in uapi, but not being uapi is the point.
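
A minimal sketch of that tracing-style approach, in the spirit of progs/map_ptr_kern.c. This is an illustration, not a tested program: it assumes a BTF/CO-RE build (vmlinux.h), and it reads kernel-internal fields of struct bpf_htab ('count', 'map.max_entries') whose layout is not uapi and can change between releases, which is exactly the point of doing it this way:

```c
// SPDX-License-Identifier: GPL-2.0
/* Hedged sketch: read internal hash-map state from a BPF prog,
 * similar to tools/testing/selftests/bpf/progs/map_ptr_kern.c.
 * struct bpf_htab comes from BTF via vmlinux.h; its fields are
 * kernel-internal, so CO-RE relocations are used to stay portable
 * across builds for as long as the fields still exist.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, __u32);
	__type(value, __u64);
} my_hash SEC(".maps");

/* Results read back by userspace after bpf_prog_test_run(). */
__u32 elem_count;
__u32 max_entries;

SEC("syscall") /* run on demand via BPF_PROG_RUN */
int read_htab_stats(void *ctx)
{
	/* The map's kernel object embeds struct bpf_map first,
	 * so the cast mirrors what map_ptr_kern.c does. */
	struct bpf_htab *htab = (struct bpf_htab *)&my_hash;

	elem_count  = BPF_CORE_READ(htab, count.counter);
	max_entries = BPF_CORE_READ(htab, map.max_entries);
	return 0;
}

char _license[] SEC("license") = "GPL";
```

Userspace would load this with libbpf, trigger it with bpf_prog_test_run_opts(), and then read elem_count/max_entries from the global data section to compute "pressure" as count/max_entries, with no kernel changes and no new uapi.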