On Mon, Sep 5, 2022 at 8:06 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Sun, 4 Sep 2022 18:32:58 -0700
> Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> > Page allocations (overheads are compared to get_free_pages() duration):
> >   6.8% Codetag counter manipulations (__lazy_percpu_counter_add + __alloc_tag_add)
> >   8.8% lookup_page_ext
> >   1237% call stack capture
> >   139% tracepoint with attached empty BPF program
>
> Have you tried tracepoint with custom callback?
>
> static void my_callback(void *data, unsigned long call_site,
>                         const void *ptr, struct kmem_cache *s,
>                         size_t bytes_req, size_t bytes_alloc,
>                         gfp_t gfp_flags)
> {
>         struct my_data_struct *my_data = data;
>
>         { do whatever }
> }
>
> [..]
>         register_trace_kmem_alloc(my_callback, my_data);
>
> Now the my_callback function will be called directly every time the
> kmem_alloc tracepoint is hit.
>
> This avoids that perf and BPF overhead.

Haven't tried that yet but will do. Thanks for the reference code!

>
> -- Steve
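
For the archives, here is a minimal module-style sketch of that approach.
It assumes the kmem_cache_alloc tracepoint with the prototype from Steve's
example; the exact event name and TP_PROTO vary between kernel versions,
and the counters (alloc_calls/alloc_bytes) are only illustrative:

/* Sketch: attach a probe directly to a kmem tracepoint, bypassing perf/BPF. */
#include <linux/module.h>
#include <linux/atomic.h>
#include <trace/events/kmem.h>	/* declares register_trace_kmem_cache_alloc() */

static atomic64_t alloc_calls;
static atomic64_t alloc_bytes;

/* First argument is the private data pointer passed at registration time. */
static void my_callback(void *data, unsigned long call_site,
			const void *ptr, struct kmem_cache *s,
			size_t bytes_req, size_t bytes_alloc,
			gfp_t gfp_flags)
{
	atomic64_inc(&alloc_calls);
	atomic64_add(bytes_alloc, &alloc_bytes);
}

static int __init my_probe_init(void)
{
	/* Called directly on every tracepoint hit once registered. */
	return register_trace_kmem_cache_alloc(my_callback, NULL);
}

static void __exit my_probe_exit(void)
{
	unregister_trace_kmem_cache_alloc(my_callback, NULL);
	/* Wait for in-flight probe calls to finish before unloading. */
	tracepoint_synchronize_unregister();
	pr_info("kmem_cache_alloc: %lld calls, %lld bytes\n",
		(long long)atomic64_read(&alloc_calls),
		(long long)atomic64_read(&alloc_bytes));
}

module_init(my_probe_init);
module_exit(my_probe_exit);
MODULE_LICENSE("GPL");

Loading the probe from a module also depends on the tracepoint being
exported with EXPORT_TRACEPOINT_SYMBOL(); a built-in probe would not need
that.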