On Fri, Jul 5, 2019 at 10:42 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> On 7/5/19 9:35 PM, Andrii Nakryiko wrote:
> > BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
> > to user space for additional processing. libbpf already has very low-level API
> > to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
> > use and requires a lot of code to set everything up. This patch adds
> > perf_buffer abstraction on top of it, abstracting setting up and polling
> > per-CPU logic into simple and convenient API, similar to what BCC provides.
> >
> > perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
> > map entries. It accepts two user-provided callbacks: one for handling raw
> > samples and one for get notifications of lost samples due to buffer overflow.
> >
> > perf_buffer__new_raw() is similar, but provides more control over how
> > perf events are set up (by accepting user-provided perf_event_attr), how
> > they are handled (perf_event_header pointer is passed directly to
> > user-provided callback), and on which CPUs ring buffers are created
> > (it's possible to provide a list of CPUs and corresponding map keys to
> > update). This API allows advanced users fuller control.
> >
> > perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
> > utilizing epoll instance.
> >
> > perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.
> >
> > All APIs are not thread-safe. User should ensure proper locking/coordination if
> > used in multi-threaded set up.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>
> > ---
> >  tools/lib/bpf/libbpf.c   | 366 +++++++++++++++++++++++++++++++++++++++
> >  tools/lib/bpf/libbpf.h   |  49 ++++++
> >  tools/lib/bpf/libbpf.map |   4 +
> >  3 files changed, 419 insertions(+)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 2a08eb106221..72149d68b8c1 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -32,7 +32,9 @@
> >  #include <linux/limits.h>
> >  #include <linux/perf_event.h>
> >  #include <linux/ring_buffer.h>
> > +#include <sys/epoll.h>
> >  #include <sys/ioctl.h>
> > +#include <sys/mman.h>
> >  #include <sys/stat.h>
> >  #include <sys/types.h>
> >  #include <sys/vfs.h>
> > @@ -4354,6 +4356,370 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
> >  	return ret;
> >  }
> >
> > +struct perf_buffer;
> > +
> > +struct perf_buffer_params {
> > +	struct perf_event_attr *attr;
> > +	/* if event_cb is specified, it takes precendence */
> > +	perf_buffer_event_fn event_cb;
> > +	/* sample_cb and lost_cb are higher-level common-case callbacks */
> > +	perf_buffer_sample_fn sample_cb;
> > +	perf_buffer_lost_fn lost_cb;
> > +	void *ctx;
> > +	int cpu_cnt;
> > +	int *cpus;
> [...]
> > +
> > +int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms)
> > +{
> > +	int cnt, err;
> > +
> > +	cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms);
> > +	for (int i = 0; i < cnt; i++) {
>
> Find one compilation error here.
>
> libbpf.c: In function ‘perf_buffer__poll’:
> libbpf.c:4728:2: error: ‘for’ loop initial declarations are only allowed
> in C99 mode
>    for (int i = 0; i < cnt; i++) {
>    ^
>

Ah... Fixing, thanks! How did you compile? make -C tools/lib/bpf doesn't
show this. Should we update the libbpf Makefile to catch stuff like this?
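
For reference, a minimal sketch of the C89-friendly loop form the error above asks for. The helper sum_events() is hypothetical and not part of libbpf; it only mirrors the shape of the loop in perf_buffer__poll(). The assumption here is that the failing build used a compiler defaulting to C89/gnu89, as older GCC releases do. Passing -std=gnu89 to a newer GCC should reproduce the same diagnostic.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not libbpf code): sums the first cnt entries.
 * The loop counter is declared at the top of the block, which is
 * valid in C89 as well as in C99 and later.
 */
static int sum_events(const int *events, int cnt)
{
	int i, total = 0;

	/* "for (int i = 0; ...)" here would be rejected in C89 mode */
	for (i = 0; i < cnt; i++)
		total += events[i];

	return total;
}
```

Declaring the counter next to the other locals (as in "int i, cnt, err;") keeps the function buildable regardless of which C dialect the toolchain defaults to.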
> > +		struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
> > +
> > +		err = perf_buffer__process_records(pb, cpu_buf);
> > +		if (err) {
> > +			pr_warning("error while processing records: %d\n", err);
> > +			return err;
> > +		}
> > +	}
> > +	return cnt < 0 ? -errno : cnt;
> > +}
> > +
> >  struct bpf_prog_info_array_desc {
> >  	int array_offset; /* e.g. offset of jited_prog_insns */
> >  	int count_offset; /* e.g. offset of jited_prog_len */
> [...]
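
To illustrate the polling pattern in the quoted perf_buffer__poll(), here is a small standalone sketch of the same idea: per-source state is stashed in epoll's data.ptr at registration time and recovered after epoll_wait(), and a failure is reported as -errno. The names struct source, source_add() and sources_poll() are made up for this example and stand in for the patch's perf_cpu_buf handling. This is not the libbpf implementation.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-source state, standing in for the patch's perf_cpu_buf */
struct source {
	int fd;
	int wakeups;	/* how many times this source was reported ready */
};

/* Register src->fd for readability and remember src in data.ptr */
static int source_add(int epoll_fd, struct source *src)
{
	struct epoll_event ev = { 0 };

	ev.events = EPOLLIN;
	ev.data.ptr = src;
	if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, src->fd, &ev) < 0)
		return -errno;
	return 0;
}

/* Wait up to timeout_ms; return the number of ready sources or -errno */
static int sources_poll(int epoll_fd, struct epoll_event *events,
			int max_events, int timeout_ms)
{
	int i, cnt;

	cnt = epoll_wait(epoll_fd, events, max_events, timeout_ms);
	if (cnt < 0)
		return -errno;	/* capture errno before any other call */

	for (i = 0; i < cnt; i++) {
		struct source *src = events[i].data.ptr;

		src->wakeups++;	/* a real consumer would drain src->fd here */
	}
	return cnt;
}
```

Checking for cnt < 0 right after epoll_wait() is a stylistic choice in this sketch. The quoted code's final "return cnt < 0 ? -errno : cnt;" gives the same result, since the loop body never runs when cnt is negative.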