On 7/5/19 9:35 PM, Andrii Nakryiko wrote:
> BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
> to user space for additional processing. libbpf already has very low-level API
> to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
> use and requires a lot of code to set everything up. This patch adds
> perf_buffer abstraction on top of it, abstracting setting up and polling
> per-CPU logic into simple and convenient API, similar to what BCC provides.
>
> perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
> map entries. It accepts two user-provided callbacks: one for handling raw
> samples and one for get notifications of lost samples due to buffer overflow.
>
> perf_buffer__new_raw() is similar, but provides more control over how
> perf events are set up (by accepting user-provided perf_event_attr), how
> they are handled (perf_event_header pointer is passed directly to
> user-provided callback), and on which CPUs ring buffers are created
> (it's possible to provide a list of CPUs and corresponding map keys to
> update). This API allows advanced users fuller control.
>
> perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
> utilizing epoll instance.
>
> perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.
>
> All APIs are not thread-safe. User should ensure proper locking/coordination if
> used in multi-threaded set up.
>
> Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>
> ---
>  tools/lib/bpf/libbpf.c   | 366 +++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h   |  49 ++++++
>  tools/lib/bpf/libbpf.map |   4 +
>  3 files changed, 419 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2a08eb106221..72149d68b8c1 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -32,7 +32,9 @@
>  #include <linux/limits.h>
>  #include <linux/perf_event.h>
>  #include <linux/ring_buffer.h>
> +#include <sys/epoll.h>
>  #include <sys/ioctl.h>
> +#include <sys/mman.h>
>  #include <sys/stat.h>
>  #include <sys/types.h>
>  #include <sys/vfs.h>
> @@ -4354,6 +4356,370 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
>  	return ret;
>  }
>
> +struct perf_buffer;
> +
> +struct perf_buffer_params {
> +	struct perf_event_attr *attr;
> +	/* if event_cb is specified, it takes precendence */
> +	perf_buffer_event_fn event_cb;
> +	/* sample_cb and lost_cb are higher-level common-case callbacks */
> +	perf_buffer_sample_fn sample_cb;
> +	perf_buffer_lost_fn lost_cb;
> +	void *ctx;
> +	int cpu_cnt;
> +	int *cpus;
[...]
> +
> +int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms)
> +{
> +	int cnt, err;
> +
> +	cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms);
> +	for (int i = 0; i < cnt; i++) {

Found one compilation error here:

libbpf.c: In function ‘perf_buffer__poll’:
libbpf.c:4728:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < cnt; i++) {
  ^

> +		struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
> +
> +		err = perf_buffer__process_records(pb, cpu_buf);
> +		if (err) {
> +			pr_warning("error while processing records: %d\n", err);
> +			return err;
> +		}
> +	}
> +	return cnt < 0 ? -errno : cnt;
> +}
> +
>  struct bpf_prog_info_array_desc {
>  	int array_offset; /* e.g. offset of jited_prog_insns */
>  	int count_offset; /* e.g. offset of jited_prog_len */
[...]
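
For reference, a minimal sketch of the usual fix for the error above: declare
the loop counter with the other locals instead of inside the for statement.
This is not the patch's code. poll_ready() and demo_poll_once() are made-up
names, not libbpf API, and the pipe-based setup exists only to make the
example stand on its own. Unlike the quoted function, the sketch checks the
epoll_wait() result before the loop rather than after it.

```c
/* Sketch only: poll_ready() and demo_poll_once() are hypothetical helpers,
 * not part of libbpf. */
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Wait on an epoll instance and count the ready events. The counter 'i'
 * is declared at the top of the function, so no C99 mode is needed. */
int poll_ready(int epoll_fd, struct epoll_event *events,
	       int max_events, int timeout_ms)
{
	int i, cnt, handled = 0;

	cnt = epoll_wait(epoll_fd, events, max_events, timeout_ms);
	if (cnt < 0)
		return -errno;
	for (i = 0; i < cnt; i++) {
		/* real code would process events[i].data.ptr here */
		handled++;
	}
	return handled;
}

/* Register one readable pipe end with epoll and poll it once. */
int demo_poll_once(void)
{
	struct epoll_event ev, out[1];
	int fds[2], epfd, ret;

	if (pipe(fds) < 0)
		return -errno;
	epfd = epoll_create1(0);
	if (epfd < 0) {
		ret = -errno;
		goto close_pipe;
	}
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = fds[0];
	/* make the read end readable before polling */
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0 ||
	    write(fds[1], "x", 1) != 1)
		ret = -errno;
	else
		ret = poll_ready(epfd, out, 1, 1000);
	close(epfd);
close_pipe:
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

Calling demo_poll_once() from a test program should report a single ready
event. In the patch itself the same change would amount to adding i to the
existing "int cnt, err;" declaration.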