Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 4 Nov 2020 12:43:38 -0800

On Wed, Nov 4, 2020 at 11:17 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Wed, 4 Nov 2020 14:12:47 +0100 Daniel Borkmann wrote:
> > If we would have done lib/bpf.c as a dynamic library back then, we wouldn't be
> > where we are today since users might be able to start consuming BPF functionality
> > just now, don't you agree? This was an explicit design choice back then for exactly
> > this reason. If we extend lib/bpf.c or import libbpf one way or another then there
> > is consistency across distros and users would be able to consume it in a predictable
> > way starting from next major releases. And you could start making this assumption
> > on all major distros in say, 3 months from now. The discussion is somehow focused
> > on the PoV of /a/ distro which is all nice and good, but the ones consuming the
> > loader shipping software /across/ distros are users writing BPF progs, all I'm
> > trying to say is that the _user experience_ should be the focus of this discussion
> > and right now we're trying hard making it rather painful for them to consume it.

This! Thanks, Daniel, for stating it very explicitly. Earlier I
mentioned iproute2 code simplification if using submodules, but that's
just a nice by-product, not the goal, so I'll just ignore that. I'll
try to emphasize the end user experience though.

What users writing BPF programs can expect from iproute2 in terms of
available BPF features is what matters. And without an enforced
minimum libbpf version, the iproute2 version tells users very little,
because the libbpf version that iproute2 ends up linking against
might be very old.
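
To make "enforcing a minimal libbpf version" concrete, here is a minimal
sketch of such a gate in plain C. The struct lib_version type and the
version_at_least() helper are hypothetical names invented for this
illustration; they are not part of libbpf or iproute2, and how the
linked version would actually be obtained is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical version triple; not a libbpf or iproute2 type. */
struct lib_version {
	int major;
	int minor;
	int patch;
};

/* Returns true when `have` is the same as or newer than `min`. */
static bool version_at_least(struct lib_version have, struct lib_version min)
{
	/* A negative component would make the comparison meaningless. */
	assert(min.major >= 0 && min.minor >= 0 && min.patch >= 0);

	if (have.major != min.major)
		return have.major > min.major;
	if (have.minor != min.minor)
		return have.minor > min.minor;
	return have.patch >= min.patch;
}
```

A build or startup check along these lines would let a given iproute2
release promise a known floor of BPF loader features.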

There was a lot of talk about API stability and backwards
compatibility. Libbpf has had a stable API and ABI for at least 1.5
years now and is very conscious about that when adding or extending
new APIs. That's not even a factor in me arguing for submodules. I'll
give a few specific examples of libbpf API not changing at all, but
how end user experience gets tremendously better.

Some of the most important APIs of libbpf are, arguably,
bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
do some preprocessing and in the end load BPF instructions into the
kernel for verification. But while the API doesn't change across libbpf
versions, the set of supported BPF-side code features changes quite a lot.

1. BTF sanitization. Newer versions of clang would emit a richer set
of BTF type information. Old kernels might not support BTF at all (but
otherwise would work just fine), or might not support some specific
newer additions to BTF. If someone were to use the latest Clang, but
outdated libbpf and old kernel, they would have a bad time, because
their BPF program would fail due to the kernel being strict about BTF.
But new libbpf would "sanitize" BTF, according to supported features
of the kernel, or just drop BTF altogether, if the kernel is that old.

If iproute2's latest version doesn't imply the latest libbpf version,
there is a high chance that the user's BPF program will fail to load.
Which requires users to be **aware** of all these complications, and
care about specific Clang versions and subsets of BTF that get
generated. With the latest libbpf all that goes away.
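
The sanitization idea can be modeled with a toy in plain C. This is
only a simplified sketch under stated assumptions: the enum kind values,
struct kernel_caps and both functions are hypothetical and local to this
example, and the particular substitutions (VAR to INT, DATASEC to
STRUCT, FUNC to TYPEDEF, FUNC_PROTO to ENUM) are meant to illustrate
the approach rather than document libbpf's exact behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative subset of BTF kinds; names are local to this sketch. */
enum kind {
	KIND_INT, KIND_STRUCT, KIND_ENUM, KIND_TYPEDEF,
	KIND_FUNC, KIND_FUNC_PROTO, KIND_VAR, KIND_DATASEC
};

/* Hypothetical result of probing the running kernel. */
struct kernel_caps {
	bool has_btf;     /* kernel accepts BTF at all */
	bool has_func;    /* kernel understands FUNC and FUNC_PROTO */
	bool has_datasec; /* kernel understands VAR and DATASEC */
};

/* Map a kind the kernel does not know to an older kind it accepts. */
static enum kind sanitize_kind(enum kind k, const struct kernel_caps *caps)
{
	if (!caps->has_datasec) {
		if (k == KIND_VAR)
			return KIND_INT;
		if (k == KIND_DATASEC)
			return KIND_STRUCT;
	}
	if (!caps->has_func) {
		if (k == KIND_FUNC)
			return KIND_TYPEDEF;
		if (k == KIND_FUNC_PROTO)
			return KIND_ENUM;
	}
	return k;
}

/* Rewrites kinds in place; returns the number of types kept,
 * where 0 means BTF is dropped altogether. */
static size_t sanitize_btf(enum kind *kinds, size_t n,
			   const struct kernel_caps *caps)
{
	assert(kinds != NULL || n == 0);

	if (!caps->has_btf)
		return 0;
	for (size_t i = 0; i < n; i++)
		kinds[i] = sanitize_kind(kinds[i], caps);
	return n;
}
```

The point of the model is that the decision depends only on what the
kernel supports, so the BPF program author never has to make it.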

2. bpf_probe_read_user() falling back to bpf_probe_read(). Newer
kernels warn if a BPF application isn't using a proper _kernel() or
_user() variant of bpf_probe_read(), and eventually will just stop
supporting generic bpf_probe_read(). So what this means is that end
users would need to compile two variants of their BPF application, one
for older kernels with bpf_probe_read(), another with
bpf_probe_read_kernel()/bpf_probe_read_user(). That's a massive pain
in the butt. But newer libbpf versions provide a completely
transparent fallback from the _user()/_kernel() variants to the generic
one if the kernel doesn't support the new variants. So the instruction to
users becomes simple: always use
bpf_probe_read_user()/bpf_probe_read_kernel().

But with iproute2 not enforcing new enough versions of libbpf, all
that goes out of the window and puts the burden back on end users.
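
The fallback described in point 2 amounts to a small decision at load
time. The sketch below models only that decision in plain C; enum
helper and resolve_helper() are hypothetical stand-ins, not libbpf
symbols, and the real helper IDs live in the kernel UAPI header.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for helper IDs; not the real UAPI values. */
enum helper {
	HELPER_PROBE_READ,
	HELPER_PROBE_READ_USER,
	HELPER_PROBE_READ_KERNEL,
	HELPER_OTHER
};

/* Pick the helper to actually call, given what the kernel supports. */
static enum helper resolve_helper(enum helper requested,
				  bool kernel_has_split_variants)
{
	assert(requested <= HELPER_OTHER);

	/* Newer kernel: keep the _user()/_kernel() variant as written. */
	if (kernel_has_split_variants)
		return requested;

	/* Older kernel: fall back to the generic helper. */
	switch (requested) {
	case HELPER_PROBE_READ_USER:
	case HELPER_PROBE_READ_KERNEL:
		return HELPER_PROBE_READ;
	default:
		return requested;
	}
}
```

With the loader making this choice, a single compiled BPF object can
serve both old and new kernels.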

3. Another feature (and far from being the last of this kind in
libbpf) is full support for individual *non-always-inlined*
functions in BPF code, which was added recently. This makes it possible
to structure BPF code better, get better instruction cache use and, on
newer kernels, even get significant speedups of BPF code verification.
This is purely a libbpf feature, no API was changed. Further, the
kernel understands the difference between global and static functions
in BPF code and optimizes verification, if possible. Libbpf takes care
of falling back to static functions for old kernels that are not yet
aware of global functions. All that is completely transparent and
works reliably without users having to deal with three variants of
doing helper functions in their BPF code.

And again, if, when using iproute2, the user doesn't know which
version of libbpf will be used, they have to assume the worst
(__always_inline) or maintain 2 or 3 different copies of their code.
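
For reference, the three ways of writing a helper function mentioned in
point 3 look roughly like this. The function names are made up for the
example. In real BPF code the __always_inline and __noinline macros
would come from libbpf's bpf_helpers.h; they are defined here, under
that assumption, only so the sketch compiles as ordinary C.

```c
#include <assert.h>

/* Stand-in definitions so the example builds outside a BPF toolchain. */
#ifndef __always_inline
#define __always_inline inline __attribute__((always_inline))
#endif
#ifndef __noinline
#define __noinline __attribute__((noinline))
#endif

/* Variant 1: always inlined, the worst-case-safe choice. */
static __always_inline int clamp_inlined(int v, int max)
{
	return v > max ? max : v;
}

/* Variant 2: static and not inlined, so it is emitted as a real call. */
static __noinline int clamp_static(int v, int max)
{
	return v > max ? max : v;
}

/* Variant 3: global (non-static) function. */
__noinline int clamp_global(int v, int max)
{
	return v > max ? max : v;
}
```

All three compute the same result; the difference is only in how the
code is emitted and, per the description above, how the kernel can
verify it.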

And there are more conveniences like that, which significantly simplify
life for BPF end users by hiding differences between kernel versions,
clang versions, etc.

A submodule is a way that I know of to make this better for end users.
If there are other ways to pull this off with shared library use, I'm
all for it, as that would preserve the security angle that distros are
arguing for. E.g., if distributions will always have the latest libbpf
available almost as soon as it's cut upstream *and* new iproute2
versions enforce the latest libbpf when they are packaged/released,
then this might work equivalently for end users. If Linux distros
would be willing to do this faithfully and promptly, I have no
objections whatsoever. Because all that matters is BPF end user
experience, as Daniel explained above.

>
> IIUC you're saying that we cannot depend on libbpf updates from distro.

As I tried to explain above, a big part of libbpf is the BPF loader,
which, while not changing the library API, does get more and more
advanced features with newer versions. So yeah, you can totally use
older versions of libbpf, but you need to be aware of all the kernel +
clang + BPF code feature interactions, which newer libbpf versions
often transparently handle for the user.

So if someone has some old BPF code not using anything fancy, they
might not care all that much, probably.

> Isn't that a pretty bad experience for all users who would like to link
> against it? There are 4 components (kernel, lib, tools, compiler) all
> need to be kept up to date for optimal user experience. Cutting corners
> with one of them leads nowhere medium term IMHO.
>
> Unless what you guys are saying is that libbpf is _not_ supposed to be
> backward compatible from the user side, and must be used a submodule.
> But then why bother defining ABI versions, or build it as an .so at all.

That's not what anyone is saying. I hope we have established in this
thread that libbpf does provide a stable API and ABI, with backward
and forward compatibility, and takes it very seriously. User BPF
programs just tend to grow in complexity and features used and newer
libbpf versions are sometimes a requirement to utilize all that
effectively.

>
> I'm also confused by the testing argument. Surely the solution is to
> add unit / system tests for iproute2. Distros will rebuild packages
> when dependencies change and retest. If we have 0 tests doesn't matter
> what update strategy there is.

Tests are good, but I'm a bit sceptical about the surface area that
could be tested. A compiled BPF program (an ELF file) is the input to
the BPF loader APIs, and that compiled BPF program can be arbitrarily
complex, using a variety of different kernel/libbpf features. So a
single unchanging API accepts an infinite variety of inputs.
selftests/bpf mandates that each new kernel and libbpf feature gets a
test, and I'm wondering if an iproute2 test suite would be able to keep
up with this. And then again, some features are not supposed to work on
older libbpf versions, so it's not clear how iproute2 would test that.
But regardless,
more testing is always better, so I hope this won't discourage testing
per se.