On Thu, Nov 12, 2020 at 12:58 AM <mariusz.dudek@xxxxxxxxx> wrote: > > From: Mariusz Dudek <mariuszx.dudek@xxxxxxxxx> > > Add support for separation of eBPF program load and xsk socket > creation. > > This is needed for use-case when you want to privide as little > privileges as possible to the data plane application that will > handle xsk socket creation and incoming traffic. > > With this patch the data entity container can be run with only > CAP_NET_RAW capability to fulfill its purpose of creating xsk > socket and handling packages. In case your umem is larger or > equal process limit for MEMLOCK you need either increase the > limit or CAP_IPC_LOCK capability. > > To resolve privileges issue two APIs are introduced: > > - xsk_setup_xdp_prog - prepares bpf program if given and > loads it on a selected network interface or loads the built in > XDP program, if no XDP program is supplied. It can also return > xsks_map_fd which is needed by unprivileged process to update > xsks_map with AF_XDP socket "fd" > > - xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap > for a particular xsk_socket > > Signed-off-by: Mariusz Dudek <mariuszx.dudek@xxxxxxxxx> > --- > tools/lib/bpf/libbpf.map | 2 + > tools/lib/bpf/xsk.c | 160 ++++++++++++++++++++++++++++++++------- > tools/lib/bpf/xsk.h | 15 ++++ > 3 files changed, 151 insertions(+), 26 deletions(-) > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > index 29ff4807b909..73aa12388055 100644 > --- a/tools/lib/bpf/libbpf.map > +++ b/tools/lib/bpf/libbpf.map > @@ -336,6 +336,8 @@ LIBBPF_0.2.0 { > perf_buffer__epoll_fd; > perf_buffer__consume_buffer; > xsk_socket__create_shared; > + xsk_setup_xdp_prog; > + xsk_socket__update_xskmap; > } LIBBPF_0.1.0; New APIs have to go into LIBBPF_0.3.0 section now. > > LIBBPF_0.3.0 { [...] > +static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, > + struct bpf_prog_cfg_opts *cfg, > + bool force_set_map, > + int *xsks_map_fd) > { > + struct xsk_socket *xsk = _xdp; > struct xsk_ctx *ctx = xsk->ctx; > __u32 prog_id = 0; > int err; > @@ -578,14 +633,17 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk) > return err; > > if (!prog_id) { > - err = xsk_create_bpf_maps(xsk); > - if (err) > - return err; > + if (!cfg || !cfg->insns_cnt) { you can't do this, use OPTS_GET() macro to access fields of opts struct. > + err = xsk_create_bpf_maps(xsk); > + if (err) > + return err; > + } else { > + ctx->xsks_map_fd = cfg->xsks_map_fd; same > + } > > - err = xsk_load_xdp_prog(xsk); > + err = xsk_load_xdp_prog(xsk, cfg); > if (err) { > - xsk_delete_bpf_maps(xsk); > - return err; > + goto err_load_xdp_prog; > } > } else { > ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id); [...] > int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, > const char *ifname, > __u32 queue_id, struct xsk_umem *umem, > @@ -838,7 +946,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, > ctx->prog_fd = -1; > > if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) { > - err = xsk_setup_xdp_prog(xsk); > + err = __xsk_setup_xdp_prog(xsk, NULL, false, NULL); > if (err) > goto out_mmap_tx; > } > diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h > index 1069c46364ff..c852ec742437 100644 > --- a/tools/lib/bpf/xsk.h > +++ b/tools/lib/bpf/xsk.h > @@ -201,6 +201,21 @@ struct xsk_umem_config { > __u32 flags; > }; > > +struct bpf_prog_cfg_opts { The name of this struct doesn't really match an API it's used in, it looks to be related to struct bpf_program, which is extremely misleading. Why didn't you go with something like xdp_setup_xdp_prog_opts? Also, so far we've been following the convention that non-optional parameters are passed directly as function arguments, with all the optional things put into opts. Looking at this struct, prog and insns_cnt look non-optional. Not sure about the license. Also, it seems strange to have xsks_map_fd in opts struct and as an output parameter... If that's really in/out param, you can do OPTS_SET() and pass it back through the same opts struct. But in general, I'm also surprised that this API is using the bpf_insn array as an input. Do people really construct such a low-level set of instructions manually? Seems like a rather painful API. Björn, Magnus, please take a look as well and chime in on API design (I have too little context on XSK use cases). > + size_t sz; /* size of this struct for forward/backward compatibility */ > + struct bpf_insn *prog; > + const char *license; > + size_t insns_cnt; > + int xsks_map_fd; > +}; > +#define bpf_prog_cfg_opts__last_field xsks_map_fd > + > +LIBBPF_API int xsk_setup_xdp_prog(int ifindex, > + struct bpf_prog_cfg_opts *opts, > + int *xsks_map_fd); > +LIBBPF_API int xsk_socket__update_xskmap(struct xsk_socket *xsk, > + int xsks_map_fd); > + > /* Flags for the libbpf_flags field. */ > #define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0) > > -- > 2.20.1 >