Toke Høiland-Jørgensen <toke@xxxxxxxxxx> [Fri, 2020-02-28 06:22 -0800]: > Hi everyone > > As most of you are no doubt aware, we've had various discussions on how > to handle multiple XDP programs on a single interface. With the freplace > functionality, the kernel infrastructure is now there to handle this > (almost, see "missing pieces" below). > > While the freplace mechanism offers userspace a lot of flexibility in > how to handle dispatching of multiple XDP programs, userspace also has > to do quite a bit of work to implement this (compared to just issuing > load+attach). The goal of this email is to get some feedback on a > library to implement this, in the hope that we can converge on something > that will be widely applicable, ensuring that both (a) everyone doesn't > have to reinvent the wheel, and (b) we don't end up with a proliferation > of subtly incompatible dispatcher models that makes it hard or > impossible to mix and match XDP programs from multiple sources. > > My proposal for the beginnings of such a library is in the xdp-tools repository > on Github, in the 'xdp-multi-prog' branch. > > To clone and compile simply do this: > > $ git clone --recurse-submodules -b xdp-multi-prog https://github.com/xdp-project/xdp-tools > $ cd xdp-tools && ./configure && make > > See lib/libxdp/libxdp.c for the library implementation, and xdp-loader/ for a > command-line loader that supports loading multiple programs in one go using the > dispatch (just supply it multiple filenames on the command line). There are > still some missing bits, marked with FIXME comments in the code, and discussed > below. > > I'm also including libxdp as a patch in the next email, but only to facilitate > easy commenting on the code; use the version of Github if you actually want to > compile and play with it. > > The overall goal of the library is to *make the simple case easy* but retain > enough flexibility for custom applications to specify their own load order etc > where needed. The "simple case" here being to just load one or more XDP programs > onto an interface while retaining any programs that may already be loaded there. > > **** Program metadata > > To do this, I propose two pieces of metadata that an XDP program can specify for > itself, which will serve as defaults to guide the loading: > > - Its *run priority*: This is simply an integer priority number that will be > used to sort programs when building the dispatcher. The inspiration is > old-style rc init scripts, where daemons are started in numerical order on > boot (e.g., /etc/rc.d/S50sshd). My hope here is that we can establish a > convention around ranges of priorities that make sense for different types of > programs; e.g., packet filters would use low priorities, and programs that > want to monitor the traffic on the host will use high priorities, etc. > > - Its *chain call actions*: These are the return codes for which the next > program should be called. The idea here is that a program can indicate which > actions it makes sense to continue operating on; the default is just XDP_PASS, > and I expect this would be the most common case. > > The metadata is specified using BTF, using a syntax similar to BTF-defined maps, > i.e.: > > struct { > __uint(priority, 10); > __uint(XDP_PASS, 1); // chain call on XDP_PASS... > __uint(XDP_ROP, 1); // ...and on XDP_DROP > } XDP_RUN_CONFIG(FUNCNAME); > > (where FUNCNAME is the function name of the XDP program this config refers to). > > Using BTF for this ensures that the metadata stays with the program in the > object file. And because this becomes part of the object BTF, it will be loaded > into the kernel and is thus also retrievable for loaded programs. > > The libxdp loaded will use the run priority to sort XDP programs before loading, > and it will use the chain call actions to configure the dispatcher program. Note > that the values defined metadata only serve as a default, though; the user > should be able to override the values on load to sort programs in an arbitrary > order. > > **** The dispatcher program > The dispatcher program is a simple XDP program that is generated from a template > to just implement a series of dispatch statements like these: > > if (num_progs_enabled < 1) > goto out; > ret = prog0(ctx); > if (!((1 << ret) & conf.chain_call_actions[0])) > return ret; > > if (num_progs_enabled < 2) > goto out; > ret = prog1(ctx); > if (!((1 << ret) & conf.chain_call_actions[1])) > return ret; > > [...] > > The num_progs_enabled and conf.chain_call_actions variables are static const > global variables, which means that the compiler will put them into the .rodata > section, allowing the kernel to perform dead code elimination if the > num_progs_enabled check fails. libxdp will set the values based on the program > metadata before loading the dispatcher, the use freplace to put the actual > component programs into the placeholders specified by prog0, prog1, etc. > > The dispatcher program makes liberal use of variables marked as 'volatile' to > prevent the compiler from optimising out the checks and calls to the dummy > functions. > > **** Missing pieces > While the libxdp code can assemble a basic dispatcher and load it into the > kernel, there are a couple of missing pieces on the kernel side; I will propose > patches to fix these, but figured there was no reason to hold back posting of > the library for comments because of this. These missing pieces are: > > - There is currently no way to persist the freplace after the program exits; the > file descriptor returned by bpf_raw_tracepoint_open() will release the program > when it is closed, and it cannot be pinned. > > - There is no way to re-attach an already loaded program to another function; > this is needed for updating the call sequence: When a new program is loaded, > libxdp should get the existing list of component programs on the interface and > insert the new one into the chain in the appropriate place. To do this it > needs to build a new dispatcher and reattach all the old programs to it. > Ideally, this should be doable without detaching them from the old dispatcher; > that way, we can build the new dispatcher completely, and atomically replace > it on the interface by the usual XDP attach mechanism. Hey Toke, Thanks for posting it. I'm still reading the library on github and may have more thoughts later, but wanted to provide early feedback on one high level thing. The main challenges I see for applying this approach in fb case is the need to recreate the dispatcher every time a new program has to be added. Imagine there there are a few containers and every container wants to run an application that attaches XDP program to the "dispatcher" via freplace. Every application may have a "priority" reserved for it, but recreating the dispatcher may have race condition, for example: - dispatcher exists in some stable stage with programs p1, p2 attached to it; - app A starts and gets the set of progs currently attached to dispatcher: p1, p2 (yeah, this kernel API doesn't exist, as you mentioned but even if it's added ..); - app A adds another program pA to the list and generates dispatcher with the new set: p1, p2, pA; - app B starts, gets same list from dispatcher: p1, p2 and adds its own pB program and generates new dispatcher. - now the end result depends on the order in which app A and app B attach corresponding new dispatcher to the interface, but in any case dispatcher won't have either pA or pB. Yes, some central coordination mechanism may be added to "lock" the lib while recreating the dispatcher, but it may not be container-environment friendly. Also I see at least one other way to do it w/o regenerating dispatcher every time: It can be created and attached once with "big enough" number of slots, for example with 100 and programs may use use their corresponding slot to freplace w/o regenerating the dispatcher. Having those big number of no-op slots should not be a big deal from what I understand and kernel can optimize it. Sure, chaining decisions may be different but it may be abstracted to a separate freplace-able step that may get the action (XDP_PASS, XDP_DROP, etc) from the XDP program, pass it to the next, decision-making function that by default proceeds only if previous program returned XDP_PASS, but user can freplace it with whatever logic is needed. And yeah, regenerating the dispatcher would need a new kernel API to be added to learn what ext-programs are currently attached to the dispatcher (that I don't know how big the deal is). This is the main thing so far, I'll likely provide more feedback when have some more time to read the code .. > --- > > Toke Høiland-Jørgensen (1): > libxdp: Add libxdp (FOR COMMENT ONLY) > > > tools/lib/xdp/libxdp.c | 856 +++++++++++++++++++++++++++++++++++++++ > tools/lib/xdp/libxdp.h | 38 ++ > tools/lib/xdp/prog_dispatcher.h | 17 + > tools/lib/xdp/xdp-dispatcher.c | 178 ++++++++ > tools/lib/xdp/xdp_helpers.h | 12 + > 5 files changed, 1101 insertions(+) > create mode 100644 tools/lib/xdp/libxdp.c > create mode 100644 tools/lib/xdp/libxdp.h > create mode 100644 tools/lib/xdp/prog_dispatcher.h > create mode 100644 tools/lib/xdp/xdp-dispatcher.c > create mode 100644 tools/lib/xdp/xdp_helpers.h > -- Andrey Ignatov