On Thu, Nov 12, 2020 at 8:12 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Thu, Nov 12, 2020 at 12:58 AM <mariusz.dudek@xxxxxxxxx> wrote: > > > > From: Mariusz Dudek <mariuszx.dudek@xxxxxxxxx> > > > > Add support for separation of eBPF program load and xsk socket > > creation. > > > > This is needed for use-case when you want to privide as little > > privileges as possible to the data plane application that will > > handle xsk socket creation and incoming traffic. > > > > With this patch the data entity container can be run with only > > CAP_NET_RAW capability to fulfill its purpose of creating xsk > > socket and handling packages. In case your umem is larger or > > equal process limit for MEMLOCK you need either increase the > > limit or CAP_IPC_LOCK capability. > > > > To resolve privileges issue two APIs are introduced: > > > > - xsk_setup_xdp_prog - prepares bpf program if given and > > loads it on a selected network interface or loads the built in > > XDP program, if no XDP program is supplied. It can also return > > xsks_map_fd which is needed by unprivileged process to update > > xsks_map with AF_XDP socket "fd" > > > > - xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap > > for a particular xsk_socket > > > > Signed-off-by: Mariusz Dudek <mariuszx.dudek@xxxxxxxxx> > > --- > > tools/lib/bpf/libbpf.map | 2 + > > tools/lib/bpf/xsk.c | 160 ++++++++++++++++++++++++++++++++------- > > tools/lib/bpf/xsk.h | 15 ++++ > > 3 files changed, 151 insertions(+), 26 deletions(-) > > > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > > index 29ff4807b909..73aa12388055 100644 > > --- a/tools/lib/bpf/libbpf.map > > +++ b/tools/lib/bpf/libbpf.map > > @@ -336,6 +336,8 @@ LIBBPF_0.2.0 { > > perf_buffer__epoll_fd; > > perf_buffer__consume_buffer; > > xsk_socket__create_shared; > > + xsk_setup_xdp_prog; > > + xsk_socket__update_xskmap; > > } LIBBPF_0.1.0; > > > New APIs have to go into LIBBPF_0.3.0 section now. > > > > > LIBBPF_0.3.0 { > > [...] > > > +static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, > > + struct bpf_prog_cfg_opts *cfg, > > + bool force_set_map, > > + int *xsks_map_fd) > > { > > + struct xsk_socket *xsk = _xdp; > > struct xsk_ctx *ctx = xsk->ctx; > > __u32 prog_id = 0; > > int err; > > @@ -578,14 +633,17 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk) > > return err; > > > > if (!prog_id) { > > - err = xsk_create_bpf_maps(xsk); > > - if (err) > > - return err; > > + if (!cfg || !cfg->insns_cnt) { > > you can't do this, use OPTS_GET() macro to access fields of opts struct. > > > + err = xsk_create_bpf_maps(xsk); > > + if (err) > > + return err; > > + } else { > > + ctx->xsks_map_fd = cfg->xsks_map_fd; > > same > > > + } > > > > - err = xsk_load_xdp_prog(xsk); > > + err = xsk_load_xdp_prog(xsk, cfg); > > if (err) { > > - xsk_delete_bpf_maps(xsk); > > - return err; > > + goto err_load_xdp_prog; > > } > > } else { > > ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id); > > [...] > > > int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, > > const char *ifname, > > __u32 queue_id, struct xsk_umem *umem, > > @@ -838,7 +946,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr, > > ctx->prog_fd = -1; > > > > if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) { > > - err = xsk_setup_xdp_prog(xsk); > > + err = __xsk_setup_xdp_prog(xsk, NULL, false, NULL); > > if (err) > > goto out_mmap_tx; > > } > > diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h > > index 1069c46364ff..c852ec742437 100644 > > --- a/tools/lib/bpf/xsk.h > > +++ b/tools/lib/bpf/xsk.h > > @@ -201,6 +201,21 @@ struct xsk_umem_config { > > __u32 flags; > > }; > > > > +struct bpf_prog_cfg_opts { > > The name of this struct doesn't really match an API it's used in, it > looks to be related to struct bpf_program, which is extremely > misleading. Why didn't you go with something like > xdp_setup_xdp_prog_opts? > > Also, so far we've been following the convention that non-optional > parameters are passed directly as function arguments, with all the > optional things put into opts. Looking at this struct, prog and > insns_cnt look non-optional. Not sure about the license. Also, it > seems strange to have xsks_map_fd in opts struct and as an output > parameter... If that's really in/out param, you can do OPTS_SET() and > pass it back through the same opts struct. > > But in general, I'm also surprised that this API is using the bpf_insn > array as an input. Do people really construct such a low-level set of > instructions manually? Seems like a rather painful API. > > Björn, Magnus, please take a look as well and chime in on API design > (I have too little context on XSK use cases). Yes, it is much too macho to write an eBPF program directly using assembly :-). My suggestion is to get rid of this whole struct bpf_prog_cfg_opts completely and the parameter in the call. There is no reason to provide functionality to load your own program in this API as it is already so easy using the existing libbpf APIs e.g., with bpf_prog_load_xattr() followed by bpf_set_link_xdp_fd(). So I suggest the following: LIBBPF_API int xsk_setup_default_prog(int ifindex, int *xsks_map_fd); It is strictly a helper function for setting up the default XDP program from a process that has the privilege to do this. Thanks Andrii for taking a look at this and thanks Mariusz for taking a stab at this. > > > + size_t sz; /* size of this struct for forward/backward compatibility */ > > + struct bpf_insn *prog; > > + const char *license; > > + size_t insns_cnt; > > + int xsks_map_fd; > > +}; > > +#define bpf_prog_cfg_opts__last_field xsks_map_fd > > + > > +LIBBPF_API int xsk_setup_xdp_prog(int ifindex, > > + struct bpf_prog_cfg_opts *opts, > > + int *xsks_map_fd); > > +LIBBPF_API int xsk_socket__update_xskmap(struct xsk_socket *xsk, > > + int xsks_map_fd); > > + > > /* Flags for the libbpf_flags field. */ > > #define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0) > > > > -- > > 2.20.1 > >