On Fri, Jul 15, 2022 at 7:18 AM Jon Doron <arilou@xxxxxxxxx> wrote: > > From: Jon Doron <jond@xxxxxx> > > Add support for writing a custom event reader, by exposing the ring > buffer. > > Few simple examples where this type of needed: > 1. perf_event_read_simple is allocating using malloc, perhaps you want > to handle the wrap-around in some other way. > 2. Since perf buf is per-cpu then the order of the events is not > guarnteed, for example: > Given 3 events where each event has a timestamp t0 < t1 < t2, > and the events are spread on more than 1 CPU, then we can end > up with the following state in the ring buf: > CPU[0] => [t0, t2] > CPU[1] => [t1] > When you consume the events from CPU[0], you could know there is > a t1 missing, (assuming there are no drops, and your event data > contains a sequential index). > So now one can simply do the following, for CPU[0], you can store > the address of t0 and t2 in an array (without moving the tail, so > there data is not perished) then move on the CPU[1] and set the > address of t1 in the same array. > So you end up with something like: > void **arr[] = [&t0, &t1, &t2], now you can consume it orderely > and move the tails as you process in order. > 3. Assuming there are multiple CPUs and we want to start draining the > messages from them, then we can "pick" with which one to start with > according to the remaining free space in the ring buffer. It would be good to mention that with perf_buffer__buffer() you get access to raw mmap()'ed per-CPU underlying memory region, which contains both perf buffer data and header, which in turn contains head/tail positions, so the above scenarios are possible to implement now. > > Signed-off-by: Jon Doron <jond@xxxxxx> > --- > tools/lib/bpf/libbpf.c | 26 ++++++++++++++++++++++++++ > tools/lib/bpf/libbpf.h | 2 ++ > tools/lib/bpf/libbpf.map | 1 + > 3 files changed, 29 insertions(+) > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index e89cc9c885b3..250263812194 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -12485,6 +12485,32 @@ int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx) > return cpu_buf->fd; > } > > +/* > + * Return the memory region of a ring buffer in *buf_idx* slot of > + * PERF_EVENT_ARRAY BPF map. This ring buffer can be used to implement > + * a custom events consumer. > + * The ring buffer starts with the *struct perf_event_mmap_page*, which > + * holds the ring buffer managment fields, when accessing the header > + * structure it's important to be SMP aware. > + * You can refer to *perf_event_read_simple* for a simple example. > + */ Please move this comment into libbpf.h to corresponding API declaration site. And make sure it is a proper doc comment (so add descriptions for parameters and return results > +int perf_buffer__buffer(struct perf_buffer *pb, int buf_idx, void **buf, > + size_t *buf_size) nit: if this fits under 100 characters, keep it as a single line > +{ > + struct perf_cpu_buf *cpu_buf; > + > + if (buf_idx >= pb->cpu_cnt) > + return libbpf_err(-EINVAL); > + > + cpu_buf = pb->cpu_bufs[buf_idx]; > + if (!cpu_buf) > + return libbpf_err(-ENOENT); > + > + *buf = cpu_buf->base; > + *buf_size = pb->mmap_size; > + return 0; > +} > + The code itself looks good and straightforward > /* > * Consume data from perf ring buffer corresponding to slot *buf_idx* in > * PERF_EVENT_ARRAY BPF map without waiting/polling. If there is no data to > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h > index 9e9a3fd3edd8..78a7ab8f610a 100644 > --- a/tools/lib/bpf/libbpf.h > +++ b/tools/lib/bpf/libbpf.h > @@ -1381,6 +1381,8 @@ LIBBPF_API int perf_buffer__consume(struct perf_buffer *pb); > LIBBPF_API int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx); > LIBBPF_API size_t perf_buffer__buffer_cnt(const struct perf_buffer *pb); > LIBBPF_API int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx); here should be that doc comment > +LIBBPF_API int perf_buffer__buffer(struct perf_buffer *pb, int buf_idx, void **buf, > + size_t *buf_size); > > typedef enum bpf_perf_event_ret > (*bpf_perf_event_print_t)(struct perf_event_header *hdr, > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > index 52973cffc20c..971072c6dfd8 100644 > --- a/tools/lib/bpf/libbpf.map > +++ b/tools/lib/bpf/libbpf.map > @@ -458,6 +458,7 @@ LIBBPF_0.8.0 { > bpf_program__set_insns; > libbpf_register_prog_handler; > libbpf_unregister_prog_handler; > + perf_buffer__buffer; Current development version is 1.0.0, please put new API in the right section > } LIBBPF_0.7.0; > > LIBBPF_1.0.0 { > -- > 2.36.1 >