On Mon, Apr 27, 2020 at 1:17 PM Yonghong Song <yhs@xxxxxx> wrote: > > Three new libbpf APIs are added to support bpf_iter: > - bpf_program__attach_iter > Given a bpf program and additional parameters, which is > none now, returns a bpf_link. > - bpf_link__create_iter > Given a bpf_link, create a bpf_iter and return a fd > so user can then do read() to get seq_file output data. > - bpf_iter_create > syscall level API to create a bpf iterator. > > Two macros, BPF_SEQ_PRINTF0 and BPF_SEQ_PRINTF, are also introduced. > These two macros can help bpf program writers with > nicer bpf_seq_printf syntax similar to the kernel one. > > Signed-off-by: Yonghong Song <yhs@xxxxxx> > --- > tools/lib/bpf/bpf.c | 11 +++++++ > tools/lib/bpf/bpf.h | 2 ++ > tools/lib/bpf/bpf_tracing.h | 23 ++++++++++++++ > tools/lib/bpf/libbpf.c | 60 +++++++++++++++++++++++++++++++++++++ > tools/lib/bpf/libbpf.h | 11 +++++++ > tools/lib/bpf/libbpf.map | 7 +++++ > 6 files changed, 114 insertions(+) > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c > index 5cc1b0785d18..7ffd6c0ad95f 100644 > --- a/tools/lib/bpf/bpf.c > +++ b/tools/lib/bpf/bpf.c > @@ -619,6 +619,17 @@ int bpf_link_update(int link_fd, int new_prog_fd, > return sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr)); > } > > +int bpf_iter_create(int link_fd, unsigned int flags) Do you envision anything more than just flags being passed for bpf_iter_create? I wonder if we should just go ahead with options struct here? > +{ > + union bpf_attr attr; > + > + memset(&attr, 0, sizeof(attr)); > + attr.iter_create.link_fd = link_fd; > + attr.iter_create.flags = flags; > + > + return sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr)); > +} > + [...] > +/* > + * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values > + * in a structure. BPF_SEQ_PRINTF0 is a simple wrapper for > + * bpf_seq_printf(). > + */ > +#define BPF_SEQ_PRINTF0(seq, fmt) \ > + ({ \ > + int ret = bpf_seq_printf(seq, fmt, sizeof(fmt), \ > + (void *)0, 0); \ > + ret; \ > + }) > + > +#define BPF_SEQ_PRINTF(seq, fmt, args...) \ You can unify BPF_SEQ_PRINTF and BPF_SEQ_PRINTF0 by using ___bpf_empty() macro. See bpf_core_read.h for similar use case. Specifically, look at ___empty (equivalent of ___bpf_empty) and ___core_read, ___core_read0, ___core_readN macro. > + ({ \ > + _Pragma("GCC diagnostic push") \ > + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ > + __u64 param[___bpf_narg(args)] = { args }; \ Do you need to provide the size of array here? If you omit __bpf_narg(args), wouldn't compiler automatically calculate the right size? Also, can you please use "unsigned long long" to not have any implicit dependency on __u64 being defined? > + _Pragma("GCC diagnostic pop") \ > + int ret = bpf_seq_printf(seq, fmt, sizeof(fmt), \ > + param, sizeof(param)); \ > + ret; \ > + }) > + > #endif > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index 8e1dc6980fac..ffdc4d8e0cc0 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -6366,6 +6366,9 @@ static const struct bpf_sec_def section_defs[] = { > .is_attach_btf = true, > .expected_attach_type = BPF_LSM_MAC, > .attach_fn = attach_lsm), > + SEC_DEF("iter/", TRACING, > + .expected_attach_type = BPF_TRACE_ITER, > + .is_attach_btf = true), It would be nice to implement auto-attach capabilities (similar to fentry/fexit, lsm and raw_tracepoint). Section name should have enough information for this, no? > BPF_PROG_SEC("xdp", BPF_PROG_TYPE_XDP), > BPF_PROG_SEC("perf_event", BPF_PROG_TYPE_PERF_EVENT), > BPF_PROG_SEC("lwt_in", BPF_PROG_TYPE_LWT_IN), > @@ -6629,6 +6632,7 @@ static int bpf_object__collect_struct_ops_map_reloc(struct bpf_object *obj, > [...] > + > + link = calloc(1, sizeof(*link)); > + if (!link) > + return ERR_PTR(-ENOMEM); > + link->detach = &bpf_link__detach_fd; > + > + attach_type = bpf_program__get_expected_attach_type(prog); Given you know it has to be BPF_TRACE_ITER, it's better to explicitly specify that. If provided program wasn't loaded with correct expected_attach_type, kernel will reject it. But if you don't do it, then you can accidentally create some other type of bpf_link. > + link_fd = bpf_link_create(prog_fd, 0, attach_type, NULL); > + if (link_fd < 0) { > + link_fd = -errno; > + free(link); > + pr_warn("program '%s': failed to attach to iterator: %s\n", > + bpf_program__title(prog, false), > + libbpf_strerror_r(link_fd, errmsg, sizeof(errmsg))); > + return ERR_PTR(link_fd); > + } > + link->fd = link_fd; > + return link; > +} > + > +int bpf_link__create_iter(struct bpf_link *link, unsigned int flags) > +{ Same question as for low-level bpf_link_create(). If we expect the need to extend optional things in the future, I'd add opts right now. But I wonder if bpf_link__create_iter() provides any additional value beyond bpf_iter_create(). Maybe let's not add it (yet)? > + char errmsg[STRERR_BUFSIZE]; > + int iter_fd; > + > + iter_fd = bpf_iter_create(bpf_link__fd(link), flags); > + if (iter_fd < 0) { > + iter_fd = -errno; > + pr_warn("failed to create an iterator: %s\n", > + libbpf_strerror_r(iter_fd, errmsg, sizeof(errmsg))); > + } > + > + return iter_fd; > +} > + [...]