On Tue, Jun 25, 2019 at 7:19 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote: > > On Tue, Jun 25, 2019 at 4:28 PM Andrii Nakryiko <andriin@xxxxxx> wrote: > > > > BPF_MAP_TYPE_PERF_EVENT_ARRAY map is often used to send data from BPF program > > to user space for additional processing. libbpf already has very low-level API > > to read single CPU perf buffer, bpf_perf_event_read_simple(), but it's hard to > > use and requires a lot of code to set everything up. This patch adds > > perf_buffer abstraction on top of it, abstracting setting up and polling > > per-CPU logic into simple and convenient API, similar to what BCC provides. > > > > perf_buffer__new() sets up per-CPU ring buffers and updates corresponding BPF > > map entries. It accepts two user-provided callbacks: one for handling raw > > samples and one for get notifications of lost samples due to buffer overflow. > > > > perf_buffer__poll() is used to fetch ring buffer data across all CPUs, > > utilizing epoll instance. > > > > perf_buffer__free() does corresponding clean up and unsets FDs from BPF map. > > > > All APIs are not thread-safe. User should ensure proper locking/coordination if > > used in multi-threaded set up. > > > > Signed-off-by: Andrii Nakryiko <andriin@xxxxxx> > > Overall looks good. Some nit below. Thanks for review! > > > --- > > tools/lib/bpf/libbpf.c | 282 +++++++++++++++++++++++++++++++++++++++ > > tools/lib/bpf/libbpf.h | 12 ++ > > tools/lib/bpf/libbpf.map | 5 +- > > 3 files changed, 298 insertions(+), 1 deletion(-) > > [...] > > > +struct perf_buffer *perf_buffer__new(struct bpf_map *map, size_t page_cnt, > > + perf_buffer_sample_fn sample_cb, > > + perf_buffer_lost_fn lost_cb, void *ctx) > > +{ > > + char msg[STRERR_BUFSIZE]; > > + struct perf_buffer *pb; > > + int err, cpu; > > + > > + if (bpf_map__def(map)->type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) { > > + pr_warning("map '%s' should be BPF_MAP_TYPE_PERF_EVENT_ARRAY\n", > > + bpf_map__name(map)); > > + return ERR_PTR(-EINVAL); > > + } > > + if (bpf_map__fd(map) < 0) { > > + pr_warning("map '%s' doesn't have associated FD\n", > > + bpf_map__name(map)); > > + return ERR_PTR(-EINVAL); > > + } > > + if (page_cnt & (page_cnt - 1)) { > > + pr_warning("page count should be power of two, but is %zu\n", > > + page_cnt); > > + return ERR_PTR(-EINVAL); > > + } > > + > > + pb = calloc(1, sizeof(*pb)); > > + if (!pb) > > + return ERR_PTR(-ENOMEM); > > + > > + pb->sample_cb = sample_cb; > > + pb->lost_cb = lost_cb; > > I think we need to check sample_cb != NULL && lost_cb != NULL. I was thinking about making them all either optional or required, but eventually decided on making sample_cb required and lost_cb optional, as in practice rarely sample_cb wouldn't be provided, while not every application would care about handling lost samples (as there is little you can do about that, except for bumping some counter). As for checking for NULL. I feel like that's overkill. If someone provided NULL for sample_cb, they will get SIGSEGV with stack trace immediately showing that's it's sample_cb being NULL. Unlike Java, C libraries tend not to double-check every pointer for NULL. Checking for things like whether map has FD or is of correct type is valuable, because if you don't check it early, then you'll just eventually get -EINVAL from kernel and will start a guessing game of what's wrong. Checking for callback to be non-null feels unnecessary, as it will be immediately obvious (and it's quite unlikely this will happen in practice). > > > + pb->ctx = ctx; > > + pb->page_size = getpagesize(); > > + pb->mmap_size = pb->page_size * page_cnt; > > + pb->mapfd = bpf_map__fd(map); > > + > > + pb->epfd = epoll_create1(EPOLL_CLOEXEC); > [...] > > +perf_buffer__process_record(struct perf_event_header *e, void *ctx) > > +{ > > + struct perf_buffer *pb = ctx; > > + void *data = e; > > + > > + switch (e->type) { > > + case PERF_RECORD_SAMPLE: { > > + struct perf_sample_raw *s = data; > > + > > + pb->sample_cb(pb->ctx, s->data, s->size); > > + break; > > + } > > + case PERF_RECORD_LOST: { > > + struct perf_sample_lost *s = data; > > + > > + if (pb->lost_cb) > > + pb->lost_cb(pb->ctx, s->lost); > > OK, we test lost_cb here, so not necessary at init time. > > [...] > > bpf_program__attach_perf_event; > > bpf_program__attach_raw_tracepoint; > > bpf_program__attach_tracepoint; > > bpf_program__attach_uprobe; > > + btf__parse_elf; > > Why move btf__parse_elf ? I realized that I haven't put it in correct alphabetical order, decided to fix it here, as it's just a single line change. > > Thanks, > Song