On Wed, Dec 2, 2020 at 2:39 AM <mariusz.dudek@xxxxxxxxx> wrote:
>
> From: Mariusz Dudek <mariuszx.dudek@xxxxxxxxx>
>
> This patch series adds support for separation of eBPF program
> load and xsk socket creation. In, for example, a Kubernetes
> environment you can have an AF_XDP CNI or daemonset that is
> responsible for launching pods that execute an application
> using AF_XDP sockets. It is desirable that the pod runs with
> as low privileges as possible, CAP_NET_RAW in this case,
> and that all operations that require privileges are contained
> in the CNI or daemonset.
>
> In this case, you have to be able to separate eBPF program load from
> xsk socket creation.
>
> Currently, this will not work with the xsk_socket__create APIs
> because you need to have CAP_NET_ADMIN privileges to load the eBPF
> program and CAP_SYS_ADMIN privileges to create or update xsk_bpf_maps.
> To be exact, xsk_set_bpf_maps does not need those privileges, but
> it takes the prog_fd and xsks_map_fd, and those are known only to the
> process that loaded the eBPF program. The APIs bpf_prog_get_fd_by_id,
> which looks up the fd of the prog using a prog_id, and
> bpf_map_get_fd_by_id, which looks up xsks_map_fd using a map_id, both
> require CAP_SYS_ADMIN.
>
> With this patch, the pod can be run with the CAP_NET_RAW capability
> only. If your umem is larger than or equal to the process limit for
> MEMLOCK, you need to either increase the limit or add the CAP_IPC_LOCK
> capability. Without this patch, in case of insufficient rights, EPERM is
> returned by xsk_socket__create.
>
> To resolve this privileges issue two new APIs are introduced:
> - xsk_setup_xdp_prog - loads the built in XDP program.
>   It can also return xsks_map_fd, which is needed by the unprivileged
>   process to update xsks_map with the AF_XDP socket "fd"
> - xsk_socket__update_xskmap - inserts an AF_XDP socket into an
>   xskmap for a particular xsk_socket
>
> Usage example:
> int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
>
> int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
>
> Inserts AF_XDP socket "fd" into the xskmap.
>
> The first patch introduces the new APIs. The second patch provides
> a new sample application that works as the control process, and modifies
> the existing xdpsock application to work with fewer privileges.
>
> This patch set is based on bpf-next commit ba0581749fec
> (net, xdp, xsk: fix __sk_mark_napi_id_once napi_id error)
>
> Since v5:
> - fixed samples/bpf/xdpsock_user.c to resolve merge conflicts
>
> Since v4:
> - samples/bpf/Makefile issues fixed
>
> Since v3:
> - force_set_map flag removed
> - leaking of xsk struct fixed
> - unified function error returning policy implemented
>
> Since v2:
> - new APIs moved into LIBBPF_0.3.0 section
> - struct bpf_prog_cfg_opts removed
> - loading own eBPF program via xsk_setup_xdp_prog functionality removed
>
> Since v1:
> - struct bpf_prog_cfg improved for backward/forward compatibility
> - API xsk_update_xskmap renamed to xsk_socket__update_xskmap
> - commit message formatting fixed
>
> Mariusz Dudek (2):
>   libbpf: separate XDP program load with xsk socket creation
>   samples/bpf: sample application for eBPF load and socket creation
>     split
>
>  samples/bpf/Makefile            |   4 +-
>  samples/bpf/xdpsock.h           |   8 ++
>  samples/bpf/xdpsock_ctrl_proc.c | 187 ++++++++++++++++++++++++++++++++
>  samples/bpf/xdpsock_user.c      | 146 +++++++++++++++++++++++--
>  tools/lib/bpf/libbpf.map        |   2 +
>  tools/lib/bpf/xsk.c             |  92 ++++++++++++++--
>  tools/lib/bpf/xsk.h             |   5 +
>  7 files changed, 425 insertions(+), 19 deletions(-)
>  create mode 100644 samples/bpf/xdpsock_ctrl_proc.c
>
> --
> 2.20.1
>

Applied to bpf-next. For the future, please carry over the Acked-by tags
you received, thanks. I've added Magnus's back.
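
For readers following the split described in the cover letter: the privileged process ends up holding xsks_map_fd, and the unprivileged process needs that same descriptor before it can call xsk_socket__update_xskmap. The cover letter does not say how the fd gets from one process to the other. One common option, assumed here purely for illustration, is SCM_RIGHTS over a Unix domain socket. The helper names send_fd and recv_fd below are hypothetical and not part of libbpf; the sketch uses only POSIX socket calls.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one file descriptor over a connected
 * Unix domain socket. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
	/* One byte of ordinary data travels with the fd; sending the
	 * ancillary data alone is not portable. */
	char payload = 'F';
	struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment of buf */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	/* Negative descriptors are a caller bug, not a runtime condition. */
	assert(sock >= 0 && fd_to_send >= 0);

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one file descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
	char payload;
	struct iovec iov = { .iov_base = &payload, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	/* Accept only a single SCM_RIGHTS message carrying exactly one fd. */
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scenario from the cover letter, the CNI or daemonset would pass the xsks_map_fd obtained from xsk_setup_xdp_prog to send_fd, and the pod would hand the descriptor returned by recv_fd to xsk_socket__update_xskmap. How the two sides establish the Unix socket (path, permissions, lifetime) is left open here.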