On 17/10/2019 13:11, Toke Høiland-Jørgensen wrote:
> I think there's a conceptual disconnect here in how we view what an XDP
> program is. In my mind, an XDP program is a stand-alone entity tied to a
> particular application; not a library function that can just be inserted
> into another program.

To me, an XDP (or any other eBPF) program is a function that is already
being 'inserted into another program', namely, the kernel. It's a
function that's being wired up to a hook in the kernel. Which isn't so
different to wiring it up to a hook in a function that's wired up to a
hook in the kernel (which is what my proposal effectively does).

> Setting aside that for a moment; the reason I don't think this belongs
> in userspace is that putting it there would carry a complexity cost that
> is higher than having it in the kernel.

Complexity in the kernel is more expensive than in userland. There are
several reasons for this, such as:
* The kernel's reliability requirements are stricter: a daemon that
  crashes can be restarted, a kernel that crashes ruins your day.
* Userland has libraries available for many common tasks that can't be
  used in the kernel.
* Anything ABI-visible (which this would be) has to be kept forever even
  if it turns out to be a Bad Idea™, because We Do Not Break Userspace™.
The last of these is the big one, and means that wherever possible the
proper course is to prototype functionality in userspace, and then once
the ABI is solid and known-useful, it can move to the kernel if there's
an advantage to doing so (typically performance).
Yes, that means applications may have to change twice (though hopefully
just a matter of building against a new libbpf), but the old
applications can be kept working (by keeping the daemon around on such
systems).

> Specifically, if we do implement
> an 'xdpd' daemon to handle all this, that would mean that we:
>
> - Introduce a new, separate code base that we'll have to write, support
>   and manage updates to.
Separation is a good thing. Whichever way we do this, we have to write
some new code. Having that code _outside_ the kernel tree helps to keep
our layers separate. Chain calling is a layering violation!

> - Add a new dependency to using XDP (now you not only need the kernel
>   and libraries, you'll also need the daemon).

You'll need *a* daemon. You won't be tied to a specific implementation.
And if you're just developing, you won't even need that (you can still
bind a prog directly to the device if you have the ackles), so it's only
for application deployment that it's needed. By the time you're at the
point of deploying an application that people are going to be installing
with "yum install myFirewall", you have the whole package manager
dependency resolution system to deal with the daemon.

> - Have to duplicate or wrap functionality currently found in the kernel;
>   at least:
>
>   - Keeping track of which XDP programs are loaded and attached to
>     each interface

There's already an API to query this.
You would probably want an atomic cmpxchg operation, so that you can
detect if someone else is fiddling with XDP and scream noisy warnings.

>     (as well as the "new state" of their attachment order).

That won't be duplicate, because it won't be in the kernel. The kernel
will only ever see one blob and it doesn't know or care how userland
assembled it.

>   - Some kind of interface with the verifier; if an app does
>     xdpd_rpc_load(prog), how is the verifier result going to get back
>     to the caller?

The daemon will get the verifier log back when it tries to update the
program; it might want to do a bit of translation before passing it on,
but an RPC call can definitely return errors to the caller.
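To make the cmpxchg idea above a bit more concrete, here is a minimal
sketch in Python. It is a toy in-memory model, not a kernel or libbpf
interface: XdpSlot, Daemon, StaleAttachment and every method name are
hypothetical, made up purely for illustration. The slot stands in for
the single XDP attach point of one interface, and the daemon only
replaces the attached program if the slot still holds what it last saw.

```python
import threading
from typing import Optional, Tuple


class StaleAttachment(Exception):
    """The attached program is not the one the caller expected."""


class XdpSlot:
    """Toy stand-in for one interface's XDP attach point (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._prog_id: Optional[int] = None  # None means nothing attached

    def query(self) -> Optional[int]:
        """Return the id of the currently attached program, if any."""
        with self._lock:
            return self._prog_id

    def cmpxchg(self, expected: Optional[int], new: Optional[int]) -> None:
        """Swap in `new` only if `expected` is still attached."""
        with self._lock:
            if self._prog_id != expected:
                raise StaleAttachment(
                    f"expected prog {expected}, found {self._prog_id}"
                )
            self._prog_id = new


class Daemon:
    """Toy daemon that remembers the last program it attached."""

    def __init__(self, slot: XdpSlot) -> None:
        self.slot = slot
        self.last_seen = slot.query()

    def update(self, new_prog_id: int) -> Tuple[bool, str]:
        """Try to attach a new program; report failure to the caller."""
        try:
            self.slot.cmpxchg(self.last_seen, new_prog_id)
        except StaleAttachment as exc:
            # Someone else changed the slot: warn loudly, change nothing.
            return False, f"WARNING: XDP state changed underneath us: {exc}"
        self.last_seen = new_prog_id
        return True, ""
```

If something else swaps the program behind the daemon's back, the next
update() fails and returns a warning instead of silently overwriting
the foreign program. The (ok, message) pair is also the shape in which
an RPC reply could carry an error back to the calling app.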
In the Ideal World of kernel dynamic linking, of course, each app prog
gets submitted to the verifier by the app to create a floating function
in the kernel that's not bound to any XDP hook (app gets its verifier
responses at this point) and then the app just sends an fd for that
function to the daemon; at that point any verifier errors after linking
are the fault of the daemon and its master program. Thus the Ideal World
doesn't need any kind of translation of verifier output to make it match
up with each individual app's program.

> - Have to deal with state synchronisation issues (how does xdpd handle
>   kernel state changing from underneath it?).

The cmpxchg I mentioned above would help with that.

> While these are issues that are (probably) all solvable, I think the
> cost of solving them is far higher than putting the support into the
> kernel. Which is why I think kernel support is the best solution :)

See my remarks above about kernel ABIs.
Also, chain calling and the synchronisation dance between apps still
looks needlessly complex and fragile to me; it's like you're having the
kernel there to be the central point of control and then not actually
having a central point of control after all. (But if chain calling does
turn out to be the right API, well, the daemon can always implement
that!)

-Ed