Re: [PATCH RFC bpf-next 00/10] bpf: CO-RE support in the kernel.

Lorenz Bauer <lmb@xxxxxxxxxxxxxx> · Tue, 28 Sep 2021 09:30:23 +0100

On Mon, 27 Sept 2021 at 17:50, Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Sep 27, 2021 at 05:12:15PM +0100, Lorenz Bauer wrote:
> > On Sat, 25 Sept 2021 at 00:13, Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Thu, Sep 23, 2021 at 12:33:58PM +0100, Lorenz Bauer wrote:
> > > >
> > > > Some questions:
> > > > * How can this handle kernels that don't have built-in BTF? Not a
> > > > problem for myself, but some people have to deal with BTF-less distro
> > > > kernels by using pahole to generate external BTF from debug symbols.
> > > > Can we accommodate that?
> > >
> > > I think so, but it probably should be done as a generic feature:
> > > "populate kernel BTF".
> > > When kernel wasn't compiled with BTF there could be a way to
> > > populate it with such. Just like we do sys_bpf(BTF_LOAD)
> > > for program's BTF we can allow populating vmlinux BTF this way.
> > > Unlike builtin BTF it wouldn't be trusted for certain verifier assumptions,
> > > but better than nothing and more convenient than specifying BTF file
> > > on a side for every bpf prog load with traditional libbpf style.
> >
> > From my POV we already have an API for external BTF (and I think
> > libbpf does too?) but would need a new API for "load kernel BTF".
> > Global state like this also doesn't work well for several individual
> > processes. Imagine multiple programs on the system trying to each
> > replace the kernel BTF, how would that work? Which one wins?
>
> The kernel BTF can be only one, of course.
> I don't expect progs to update the kernel BTF when they start.
> It's more of the admin/chef job when kernel boots.
> Only for the cases when kernel somehow was compiled without BTF.
>
> > Being
> > able to give my own fd for kernel BTF circumvents all those problems
> > and seems much cleaner to me.
>
> You mean to pass kernel BTF's fd to the kernel?
> It's doable, but I don't quite see the operational side of it.
> If progs have to carry both their BTF and kernel BTF why would
> they need CO-RE at all? If they were compiled with given kernel BTF
> there is no need to adjust offsets for the given host.
> I suspect I simply don't understand your use case :)

This is the "distro ships without BTF" case that the Aqua Security folks
have been grappling with, and for which btfhub is a solution. If the
distro disables BTF they are unlikely to perform this "admin" job in
the first place. So whose responsibility is it to load that BTF?
Currently it falls on the developers of the user space tooling to
provide alternative BTF. Only allowing a single replacement BTF makes
this difficult.

Here is why:
* Since external BTF is a thing, loaders today have to provide a way
to relocate against external BTF in a non-standard location. This
means loading the file from disk and then performing CO-RE using that
info.
* Users of the loader build a btfhub integration (or similar) and
provide a path to the external BTF during load. They do this because
they will have to support legacy kernels for years to come.
* Under my proposal, a loader can detect whether in-kernel CO-RE is
supported, load the BTF provided by the user into the kernel, and pass
that fd to PROG_LOAD.
* This is transparent to the user: they keep using their existing BTF
but get the benefit of canonical CO-RE resolution.
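
The flow in the list above can be sketched as a small decision function.
This is only an illustration of the proposal, not an existing loader or
kernel interface: plan_load, LoadPlan and their fields are hypothetical
names, the two booleans stand in for whatever feature probes a loader
would really run, and a BTF fd argument on PROG_LOAD is what is being
proposed here rather than something available today. It also assumes
that a user-supplied BTF path takes precedence over the kernel's own BTF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoadPlan:
    """Hypothetical description of how a loader would relocate a program."""
    relocate_in: str   # "kernel" or "userspace"
    btf_source: str    # "vmlinux" or the path of an external BTF file
    pass_btf_fd: bool  # True: BTF_LOAD the file, hand its fd to PROG_LOAD


def plan_load(kernel_core: bool, vmlinux_btf: bool,
              external_btf: Optional[str] = None) -> LoadPlan:
    """Pick a relocation strategy.

    kernel_core  -- result of a (hypothetical) probe for in-kernel CO-RE
    vmlinux_btf  -- whether the running kernel exposes built-in BTF
    external_btf -- path the user supplied, e.g. from a btfhub integration
    """
    if external_btf is not None:
        # Assumption: a user-supplied path always wins, as it does when
        # loaders relocate against external BTF in userspace today.
        if kernel_core:
            # Proposed path: load the user's BTF into the kernel and let
            # the kernel relocate against that fd.
            return LoadPlan("kernel", external_btf, True)
        # Legacy path: read the file and perform CO-RE in the loader.
        return LoadPlan("userspace", external_btf, False)

    if not vmlinux_btf:
        raise ValueError("no kernel BTF and no external BTF supplied")

    # Built-in BTF is available; no extra fd is needed either way.
    return LoadPlan("kernel" if kernel_core else "userspace", "vmlinux", False)
```

The caller passes the same external BTF path in both branches; only the
place where relocation happens changes, which is why the switch is
invisible to the user.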

With a caller-supplied BTF fd we don't have to introduce a new
loader-side API to deal with this situation. We also don't have to deal
with a global resource that is subject to the whims of the distro.

Lorenz

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com