Re: How to orchestrate multiple XDP programs

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Tue, 23 Feb 2021 12:07:04 +0100

"Brian G. Merrell" <brian.g.merrell@xxxxxxxxx> writes:

> On 21/02/22 11:41PM, Toke Høiland-Jørgensen wrote:
>> "Brian G. Merrell" <brian.g.merrell@xxxxxxxxx> writes:
>> > On 21/02/18 05:20PM, Toke Høiland-Jørgensen wrote:
>
>> >> No, I think the main difference is that in the model you described,
>> >> you're assuming that your orchestration system would install the XDP
>> >> program on behalf of the application as well as launch the userspace
>> >> bits.
>> >
>> > Yes, that's right. This is the model we are implementing.
>> >
>> >> Whereas I'm assuming that an application that uses XDP will start
>> >> in userspace (launched by systemd, most likely), and will then load its
>> >> own XDP program after possibly doing some initialisation first (e.g.,
>> >> pre-populating maps, that sort of thing).
>> >> 
>> >> From what I've understood from what you explained about your setup, your
>> >> system could work with both models as well; so why are you assuming that
>> >> applications won't want to install their own XDP programs? :)
>> >
>> > I would just say that in our organization's network and administration
>> > environment, we ideally want centralized orchestration tooling and a
>> > control plane that is used for all XDP (and tc) programs running on our
>> > machines with our model described above.
>> 
>> Right, sure, I'm not disputing this model is useful as well, I'm just
>> wondering about how you envision the details working. Say your
>> orchestration system installs an XDP program on behalf of an application
>> and then launches the userspace component (assuming one exists). How is
>> that userspace program supposed to obtain a file descriptor for the
>> map(s) used by the XDP program in order to communicate with it?
>
> OK, so this part is admittedly a little hand-wavy and a work in
> progress. We're literally working on design and proof of concepts right
> now, but this is basically what we're envisioning:
>
> 1. Orchestration tool gets all its JSON config data, which includes
>    remote paths for BPF programs and any respective userspace
>    programs.
> 2. Orchestration tool downloads BPF programs and loads them (using
>    Go libxdp when it's available). Then (and this is where I'm going to
>    start waving my hands) the orchestrator will need to gather any
>    necessary map names/ids/fds information to be sent to the userspace
>    program. I'm just not exactly sure how easy/hard/possible this part
>    is.
> 3. We start the userspace programs as separate processes and communicate
>    with them via RPC (there's a nice Go plugin system for this[1]). Each
>    userspace program implements an interface and we communicate the map
>    info (among other things) over RPC to the userspace program when it
>    starts.
>
> I'm going to continue researching and fleshing out the details, but are
> there any obvious problems with this approach?

I think the basic idea can work (it's similar to systemd's socket
activation, which also passes the socket fd to the userspace process on
launch). However, there are a couple of things that become impossible
for the userspace process to do in this model:

- Modifying the BPF object before load: Libbpf does quite a few
  transformations on the bytecode to handle relocations, and it's also
  going to grow a full linker at some point. This is not a problem if
  the userspace program just lets libbpf do the default thing, but if it
  wants to customise the operations it becomes a problem. The obvious
  use case for this that comes to mind is dynamically omitting parts of
  the code for features that are not enabled (like we do in xdp-filter).

- Populating maps before load: This is necessary to use customised
  'const' global variables: The maps backing these are frozen on load to
  allow the verifier to make strong assumptions about their content, so
  you can't modify them after the BPF object is loaded.

- Atomic map population: Say you have an XDP program that reacts to
  traffic steering rules, and you start out with the program being
  attached to the interface and an empty map that userspace then has to
  populate. While the map is being populated, the XDP program will
  process some packets with an incomplete view of the final ruleset.
  Whereas if you can populate the map completely before attaching the
  program you can be sure that it's consistent. Depending on the nature
  of the application, this may lead to weird effects, or it may be
  mostly harmless.

Now, all of these could in principle be performed by the orchestrator on
behalf of the program, but that means you'll have to make the
orchestrator more complex, and you'll have to come up with a way to
express these operations in your configuration language.
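
To make that last point concrete, here is a minimal sketch of what such a
config entry might have to carry, and the order in which an orchestrator
would need to apply it. Every field name, the sample JSON and the step
strings are assumptions made up for illustration; nothing here is an
existing libxdp or libbpf interface, and the actual load/attach calls are
left out entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// MapEntry is one key/value pair to insert into a map before attach.
// (Hypothetical format: opaque strings the orchestrator would have to decode.)
type MapEntry struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// ProgramConfig is a hypothetical config entry for one XDP program.
type ProgramConfig struct {
	Name        string                `json:"name"`
	Object      string                `json:"object"`       // path to the BPF object
	Interface   string                `json:"interface"`    // where to attach
	Constants   map[string]uint64     `json:"constants"`    // 'const' globals, set before load
	InitialMaps map[string][]MapEntry `json:"initial_maps"` // entries inserted before attach
}

// ParseConfig decodes one config entry and checks the required fields.
func ParseConfig(data []byte) (ProgramConfig, error) {
	var cfg ProgramConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, fmt.Errorf("parsing program config: %w", err)
	}
	if cfg.Object == "" || cfg.Interface == "" {
		return cfg, fmt.Errorf("config %q: object and interface are required", cfg.Name)
	}
	return cfg, nil
}

// Plan lists the steps in the order they must happen: constants before
// load (their backing map is frozen on load), map population before attach
// (so the program never sees a half-filled ruleset).
func (c ProgramConfig) Plan() []string {
	steps := []string{"open " + c.Object}

	consts := make([]string, 0, len(c.Constants))
	for name := range c.Constants {
		consts = append(consts, name)
	}
	sort.Strings(consts) // deterministic order
	for _, name := range consts {
		steps = append(steps, fmt.Sprintf("set-const %s=%d", name, c.Constants[name]))
	}
	steps = append(steps, "load")

	maps := make([]string, 0, len(c.InitialMaps))
	for name := range c.InitialMaps {
		maps = append(maps, name)
	}
	sort.Strings(maps)
	for _, name := range maps {
		steps = append(steps, fmt.Sprintf("populate %s (%d entries)", name, len(c.InitialMaps[name])))
	}
	return append(steps, "attach "+c.Interface)
}

// sampleConfig is made-up example data.
const sampleConfig = `{
  "name": "filter",
  "object": "filter.o",
  "interface": "eth0",
  "constants": {"enable_udp": 1, "enable_ipv6": 0},
  "initial_maps": {"rules": [{"key": "10.0.0.1", "value": "drop"},
                             {"key": "10.0.0.2", "value": "pass"}]}
}`

func main() {
	cfg, err := ParseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, step := range cfg.Plan() {
		fmt.Println(step)
	}
}
```

Even this toy version needs a typed encoding for constants and map
keys/values, which is the kind of complexity I mean.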

> A backup plan is to have the userspace programs do the loading of the
> BPF program, but it's not obvious to me how that would be easier to
> obtain the file descriptor for the map(s) vs. having the orchestrator
> figure it out and send it to the userspace process.

Both approaches carry complexities with them, and I'm not sure there
really is a universal right answer. What tipped the scales for me were
the issues above. However, being in a more controlled environment, the
trade-off may well be different for you.
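
For the fd hand-off itself, a socket-activation-style scheme is probably
the simplest thing that can work when the orchestrator launches the
process. A rough sketch in Go, using only the standard library: the
environment variable name XDP_MAP_NAMES is invented for this example
(modelled on systemd's LISTEN_FDNAMES), and a pipe stands in for a real
BPF map fd.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// mapNamesEnv is a hypothetical variable name, not an existing convention.
const mapNamesEnv = "XDP_MAP_NAMES"

// firstInheritedFD: entry i of exec.Cmd.ExtraFiles becomes fd 3+i in the child.
const firstInheritedFD = 3

// PrepareChild (orchestrator side) arranges for the child to inherit one
// fd per map and to learn the map names, in the same order, from the
// environment.
func PrepareChild(cmd *exec.Cmd, names []string, files []*os.File) error {
	if len(names) != len(files) {
		return fmt.Errorf("got %d names for %d files", len(names), len(files))
	}
	for _, n := range names {
		if n == "" || strings.Contains(n, ":") {
			return fmt.Errorf("invalid map name %q", n)
		}
	}
	cmd.ExtraFiles = files
	cmd.Env = append(os.Environ(), mapNamesEnv+"="+strings.Join(names, ":"))
	return nil
}

// InheritedMapFDs (child side) turns the colon-separated name list into a
// name -> fd number table. The child would call it with os.Getenv(mapNamesEnv).
func InheritedMapFDs(env string) (map[string]int, error) {
	fds := make(map[string]int)
	if env == "" {
		return fds, nil
	}
	for i, name := range strings.Split(env, ":") {
		if name == "" {
			return nil, fmt.Errorf("empty map name at position %d", i)
		}
		fds[name] = firstInheritedFD + i
	}
	return fds, nil
}

func main() {
	// Stand-in for a map fd; a real orchestrator would wrap the fd it got
	// from its BPF loader, e.g. with os.NewFile.
	r, w, err := os.Pipe()
	if err != nil {
		fmt.Println(err)
		return
	}
	defer r.Close()
	defer w.Close()

	cmd := exec.Command("/usr/bin/env") // placeholder child; not started here
	if err := PrepareChild(cmd, []string{"rules"}, []*os.File{r}); err != nil {
		fmt.Println(err)
		return
	}

	fds, err := InheritedMapFDs("rules:stats")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(fds["rules"], fds["stats"])
}
```

Note that this only covers handing over already-created maps; it does
nothing for the three issues listed above.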

> If it works out that the orchestrator can load the BPF programs on
> behalf of the userspace programs, then I think the primary benefit is
> that the developer of the userspace program doesn't need to follow
> some boilerplate to load it the appropriate way--we've done all that for
> them. It seems nice that the orchestrator could be the one interface
> with libxdp (for the XDP case) without every userspace program needing
> to do its own adding/removing (and thus dispatcher swapping),
> though I would guess that's not really a problem at all.

Yeah, I do agree that it would be nicer if there was a clean interface
the application could talk to without having to muck about with
dispatchers. My hope is that by encapsulating all that in a library we
can pretend that there is :)

However, I can see how this perspective may also be different in a Go
world: With a C library in a distro we can ship the library as a
separate package and as long as we maintain ABI compatibility, we can
upgrade the library independently of the applications. Whereas with Go's
vendoring approach it becomes way harder to ensure that all applications
use the "right" version of the library. In that sense, a C library is
more of a "system service" whereas a Go library is more of an
"application sub-function".

> I feel like I've gone out of the scope of libxdp in this e-mail, but
> you did ask :) And I do appreciate any feedback or raising of red
> flags.

Only slightly :)
As outlined above, all of this has gone into my thinking when designing
libxdp, so I appreciate the chance to get your perspectives on these
very real tradeoffs. So thanks again for taking the time to explain your
thought process!

-Toke