On 7/5/19 10:54 PM, Andrii Nakryiko wrote:
> On Fri, Jul 5, 2019 at 10:42 PM Yonghong Song <yhs@xxxxxx> wrote:
>>
>>
>>
>> On 7/5/19 9:35 PM, Andrii Nakryiko wrote:
>>> BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program
>>> to user space for additional processing. libbpf already has very low-level API
>>> to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to
>>> use and requires a lot of code to set everything up. This patch adds
>>> perf_buffer abstraction on top of it, abstracting setting up and polling
>>> per-CPU logic into simple and convenient API, similar to what BCC provides.
>>>
>>> perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF
>>> map entries. It accepts two user-provided callbacks: one for handling raw
>>> samples and one for getting notifications of lost samples due to buffer overflow.
>>>
>>> perf_buffer__new_raw() is similar, but provides more control over how
>>> perf events are set up (by accepting user-provided perf_event_attr), how
>>> they are handled (perf_event_header pointer is passed directly to
>>> user-provided callback), and on which CPUs ring buffers are created
>>> (it's possible to provide a list of CPUs and corresponding map keys to
>>> update). This API allows advanced users fuller control.
>>>
>>> perf_buffer__poll() is used to fetch ring buffer data across all CPUs,
>>> utilizing epoll instance.
>>>
>>> perf_buffer__free() does corresponding clean up and unsets FDs from BPF map.
>>>
>>> None of these APIs are thread-safe. Users should ensure proper
>>> locking/coordination if used in a multi-threaded setup.
>>>
>>> Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>
>>> ---
>>> tools/lib/bpf/libbpf.c   | 366 +++++++++++++++++++++++++++++++++++++++
>>> tools/lib/bpf/libbpf.h   |  49 ++++++
>>> tools/lib/bpf/libbpf.map |   4 +
>>> 3 files changed, 419 insertions(+)
>>>
>>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>>> index 2a08eb106221..72149d68b8c1 100644
>>> --- a/tools/lib/bpf/libbpf.c
>>> +++ b/tools/lib/bpf/libbpf.c
>>> @@ -32,7 +32,9 @@
>>>  #include <linux/limits.h>
>>>  #include <linux/perf_event.h>
>>>  #include <linux/ring_buffer.h>
>>> +#include <sys/epoll.h>
>>>  #include <sys/ioctl.h>
>>> +#include <sys/mman.h>
>>>  #include <sys/stat.h>
>>>  #include <sys/types.h>
>>>  #include <sys/vfs.h>
>>> @@ -4354,6 +4356,370 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
>>>  	return ret;
>>>  }
>>>
>>> +struct perf_buffer;
>>> +
>>> +struct perf_buffer_params {
>>> +	struct perf_event_attr *attr;
>>> +	/* if event_cb is specified, it takes precedence */
>>> +	perf_buffer_event_fn event_cb;
>>> +	/* sample_cb and lost_cb are higher-level common-case callbacks */
>>> +	perf_buffer_sample_fn sample_cb;
>>> +	perf_buffer_lost_fn lost_cb;
>>> +	void *ctx;
>>> +	int cpu_cnt;
>>> +	int *cpus;
>> [...]
>>> +
>>> +int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms)
>>> +{
>>> +	int cnt, err;
>>> +
>>> +	cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms);
>>> +	for (int i = 0; i < cnt; i++) {
>>
>> Found one compilation error here.
>>
>> libbpf.c: In function ‘perf_buffer__poll’:
>> libbpf.c:4728:2: error: ‘for’ loop initial declarations are only allowed
>> in C99 mode
>>   for (int i = 0; i < cnt; i++) {
>>   ^
>>
>
> Ah... Fixing, thanks! How did you compile? make -C tools/lib/bpf
> doesn't show this, should we update libbpf Makefile to catch stuff
> like this?

I did not make any code changes. My compiler is gcc 4.8.5, so it is
possible that the older compiler is simply less tolerant.

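A likely explanation, offered as an assumption rather than something
checked against this tree: GCC releases before 5 default to -std=gnu90,
which rejects a declaration inside the for initializer, while GCC 5 and
later default to -std=gnu11 and accept it. If that is the cause, building
with -std=gnu89 on a newer compiler should reproduce the error. The
portable form declares the counter at the top of the enclosing block, as
in this small stand-alone sketch (sum_ints() is a hypothetical helper used
only for illustration, it is not libbpf code):

```c
#include <stddef.h>

/* Hypothetical helper, not part of libbpf: sums n ints in C89 style. */
int sum_ints(const int *vals, size_t n)
{
	size_t i;	/* declared at block start, so -std=gnu89 accepts it */
	int total = 0;

	for (i = 0; i < n; i++)
		total += vals[i];
	return total;
}
```

The same change applied to the loop in perf_buffer__poll() would move
"int i" next to "int cnt, err;" and leave the loop body untouched.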
>>> +		struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
>>> +
>>> +		err = perf_buffer__process_records(pb, cpu_buf);
>>> +		if (err) {
>>> +			pr_warning("error while processing records: %d\n", err);
>>> +			return err;
>>> +		}
>>> +	}
>>> +	return cnt < 0 ? -errno : cnt;
>>> +}
>>> +
>>>  struct bpf_prog_info_array_desc {
>>>  	int array_offset; /* e.g. offset of jited_prog_insns */
>>>  	int count_offset; /* e.g. offset of jited_prog_len */
>> [...]
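
For readers following the thread, the epoll pattern used by
perf_buffer__poll() above (wait once, then dispatch each ready entry
through events[i].data.ptr) can be tried in isolation. The sketch below is
not the patch's code: struct demo_buf, demo_poll() and demo_run() are
hypothetical names, and a pipe stands in for the per-CPU perf event FD. It
also declares the loop counter at the top of the block, so it builds in
C89/gnu89 mode.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for the patch's struct perf_cpu_buf. */
struct demo_buf {
	int fd;		/* read end of a pipe, standing in for a perf event FD */
	int handled;	/* how many times this buffer was dispatched */
};

/* Wait on epfd and dispatch every ready entry through data.ptr.
 * Returns the number of ready entries, or -errno if epoll_wait() fails. */
int demo_poll(int epfd, struct epoll_event *events, int max_events,
	      int timeout_ms)
{
	int i, cnt;

	cnt = epoll_wait(epfd, events, max_events, timeout_ms);
	if (cnt < 0)
		return -errno;

	for (i = 0; i < cnt; i++) {
		struct demo_buf *buf = events[i].data.ptr;
		char byte;

		/* Consume one byte so the FD stops reporting readable. */
		if (read(buf->fd, &byte, 1) == 1)
			buf->handled++;
	}
	return cnt;
}

/* Register one pipe, make it readable, poll once.
 * Returns the number of dispatches, or -1 on any setup failure. */
int demo_run(void)
{
	struct demo_buf buf = { -1, 0 };
	struct epoll_event ev = { 0 }, ready[1];
	int pipefd[2], epfd, ret = -1;

	if (pipe(pipefd) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd >= 0) {
		buf.fd = pipefd[0];
		ev.events = EPOLLIN;
		ev.data.ptr = &buf;
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
		    write(pipefd[1], "x", 1) == 1 &&
		    demo_poll(epfd, ready, 1, 1000) == 1)
			ret = buf.handled;
		close(epfd);
	}
	close(pipefd[0]);
	close(pipefd[1]);
	return ret;
}
```

One difference from the quoted function is deliberate: the sketch checks
cnt < 0 immediately after epoll_wait() instead of after the loop, so errno
is read before any other call can overwrite it.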