Re: [PATCH RFC bpf-next 0/3] libbpf: Add support for extern function calls

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Fri, 20 Dec 2019 12:30:47 -0800

On Thu, Dec 19, 2019 at 03:29:30PM +0100, Toke Høiland-Jørgensen wrote:
> This series adds support for resolving function calls to functions marked as
> 'extern' in eBPF source files, by resolving the function call targets at load
> time. For now, this only works by static linking (i.e., copying over the
> instructions from the function target). Once the kernel support for dynamic
> linking lands, support can be added for having a function target be an already
> loaded program fd instead of a bpf object.
> 
> The API I'm proposing for this is that the caller specifies an explicit mapping
> between extern function names and function names in the target object file.
> This is to support the XDP multi-prog case, where the dispatcher program may not
> necessarily have control over function names in the target programs, so simple
> function name resolution can't be used.

I think simple name resolution should be the default behavior for both static
and dynamic linking. That's the part where I think we must not reinvent the wheel.
When one .c has
extern int prog1(struct xdp_md *ctx);
another .c should have:
int prog1(struct xdp_md *ctx) {...}
Both static and dynamic linking should link these two .c together without any
extra steps from the user. It's expected behavior that any C user assumes and
it should 'just work'.
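
To make the default concrete, here is a minimal user-space sketch of name-based
resolution. The struct and function names (sym, resolve_by_name) are made up for
illustration and are not libbpf types; a real linker would also check
signatures via BTF, which this sketch skips.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol record; not a libbpf type. */
struct sym {
	const char *name;
	int defined;   /* 1 = function body present, 0 = extern reference */
	int insn_off;  /* offset of the body in the defining object */
};

/*
 * Find the definition that satisfies an extern reference purely by name.
 * Returns NULL when no object defines the function.
 */
static const struct sym *resolve_by_name(const struct sym *syms, size_t n_syms,
					 const char *extern_name)
{
	for (size_t i = 0; i < n_syms; i++) {
		if (syms[i].defined && strcmp(syms[i].name, extern_name) == 0)
			return &syms[i];
	}
	return NULL;
}
```

With this, "extern int prog1(...)" in one object resolves to "int prog1(...)" in
another with no mapping supplied by the user.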

Where we need to be creative is how to plug two XDP firewalls with arbitrary
program names (including the same names) into a common rootlet.

One firewall can be:
noinline int foo(struct xdp_md *ctx)
{ // some logic
}
SEC("xdp")
int xdp_prog1(struct xdp_md *ctx)
{
       return foo(ctx);
}

And another firewall:
noinline int foo(struct xdp_md *ctx)
{ // some other logic
}
SEC("xdp")
int xdp_prog2(struct xdp_md *ctx)
{
       return foo(ctx);
}

Both xdp programs (with multiple functions) need to be connected into:

__weak noinline int dummy1(struct xdp_md *ctx) { return XDP_PASS; }
__weak noinline int dummy2(struct xdp_md *ctx) { return XDP_PASS; }

SEC("xdp")
int rootlet(struct xdp_md *ctx)
{
        int ret;

        ret = dummy1(ctx);
        if (ret != XDP_PASS)
                goto out;

        ret = dummy2(ctx);
        if (ret != XDP_PASS)
                goto out;
out:
        return ret;
}

where xdp_prog1() from 1st firewall needs to replace dummy1()
and xdp_prog2() from 2nd firewall needs to replace dummy2().
Or the other way around depending on the order of installation.

At the kernel level the API is actually simple. It's the pair of
target_prog_fd + btf_id I described earlier in the "static/dynamic linking" thread,
where target_prog_fd is the FD of the rootlet already loaded into the kernel and
btf_id is the BTF id of dummy1 or dummy2.

When the 1st firewall is being loaded, libbpf needs to pass target_prog_fd+btf_id
along with xdp_prog1() into the kernel, so that the verifier can do
appropriate checking and refcnting.

Note that the kernel and every program have their own BTF id space.
Their own BTF ids == their own names.
Loading two programs with exactly the same name is ok today and in the future.
Linking into another program's name space is where we need to agree on naming first.
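
A rough sketch of the per-program id space and the target pair, in plain
user-space C. All types here (func_btf, loaded_prog, relink_target) and the
helper make_target are hypothetical; only the two field names target_prog_fd
and btf_id come from the description above, and none of this is a confirmed
kernel or libbpf interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: one function entry in a program's own BTF id space. */
struct func_btf {
	const char *name;
	uint32_t btf_id;
};

/* Hypothetical view of an already loaded program. */
struct loaded_prog {
	int fd;
	const struct func_btf *funcs;
	size_t n_funcs;
};

/* The pair handed to the kernel when loading a replacement function. */
struct relink_target {
	int target_prog_fd; /* FD of the loaded rootlet */
	uint32_t btf_id;    /* BTF id of dummy1 or dummy2 in that rootlet */
};

/*
 * Look up func_name inside one program only. BTF ids are meaningful only
 * relative to the program they belong to, so the FD and the id travel together.
 * Returns 0 on success, -1 if the program has no such function.
 */
static int make_target(const struct loaded_prog *prog, const char *func_name,
		       struct relink_target *out)
{
	for (size_t i = 0; i < prog->n_funcs; i++) {
		if (strcmp(prog->funcs[i].name, func_name) == 0) {
			out->target_prog_fd = prog->fd;
			out->btf_id = prog->funcs[i].btf_id;
			return 0;
		}
	}
	return -1;
}
```

Two programs can reuse the same names and the same numeric ids without
conflict, because a lookup never crosses program boundaries.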

The static linking of two .o should follow familiar user space linker logic.
Proposed bpf_linker__..("first.o") and bpf_linker__..("second.o") should work.
Meaning that "extern int foo()" in "second.o" will get resolved with "int foo()"
from "first.o".
Dynamic linking is when "first.o" with "int foo()" was already loaded into
the kernel and "second.o" is loaded after. In such case its "extern int foo()"
will be resolved dynamically from previously loaded program.
The user space analogy of this behavior is glibc.
"first.o" is glibc.so that supplies memcpy() and friends.
"second.o" is some a.out that used "extern int memcpy()".

For the XDP rootlet case, the already loaded weak functions dummy[12]() need to
be replaced later by xdp_prog[12](). It's like replacing memcpy() in glibc.so.
I think user space doesn't have such a concept. I was simply calling it
dynamic linking too, but that's not quite accurate. It's dynamically replacing
already loaded functions. Let's call it "dynamic re-linking"?
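
As a user-space analogy only, re-linking can be pictured as swapping entries in
a table of call slots while the dispatcher keeps running. Everything below
(struct pkt, slots, relink, drop_short, the ACT_* values) is made up for the
sketch; the kernel would replace the call target of a loaded function rather
than go through function pointers.

```c
#include <stddef.h>

/* Stand-ins for XDP verdicts; values mirror XDP_DROP (1) and XDP_PASS (2). */
enum { ACT_DROP = 1, ACT_PASS = 2 };

/* Hypothetical packet context, standing in for struct xdp_md. */
struct pkt {
	int len;
};

typedef int (*slot_fn)(struct pkt *);

/* Plays the role of the weak dummy1()/dummy2() placeholders. */
static int default_pass(struct pkt *p)
{
	(void)p;
	return ACT_PASS;
}

#define N_SLOTS 2
static slot_fn slots[N_SLOTS] = { default_pass, default_pass };

/* The dispatcher: run each slot in order, stop at the first non-PASS verdict. */
static int rootlet(struct pkt *p)
{
	for (int i = 0; i < N_SLOTS; i++) {
		int ret = slots[i](p);

		if (ret != ACT_PASS)
			return ret;
	}
	return ACT_PASS;
}

/* Replace the function behind an already "loaded" slot. */
static int relink(int slot, slot_fn fn)
{
	if (slot < 0 || slot >= N_SLOTS || fn == NULL)
		return -1;
	slots[slot] = fn;
	return 0;
}

/* Example firewall entry point: drop anything shorter than 64 bytes. */
static int drop_short(struct pkt *p)
{
	return p->len < 64 ? ACT_DROP : ACT_PASS;
}
```

Calling relink(0, drop_short) changes what rootlet() does for subsequent
packets without touching rootlet() itself, which is the property the rootlet
needs from re-linking.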

As for the libbpf API for dynamic linking, so far I haven't needed to add anything new.
I'm trying to piggyback on the fexit/fentry approach.

I think to prototype re-linking without kernel support we can do static re-linking.
I think the best approach is to stick with name based resolution. libxdp can do:
- add alias("dummy1") to xdp_prog1() in first_firewall.o
- rename foo() in first_firewall.o into unique_foo().
- add alias("dummy2") to xdp_prog2() in second_firewall.o
- rename foo() in second_firewall.o into very_unique_foo().
- use standard static linking of first_firewall.o + second_firewall.o + rootlet.o
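
The steps above can be sketched on a toy symbol table. This is only a model of
the bookkeeping: struct obj and the helpers find_sym, rename_sym, add_alias and
count_clashes are hypothetical names, and no ELF or BTF rewriting is done here.

```c
#include <string.h>

#define SYM_NAME_LEN 64
#define MAX_SYMS 8

/* Hypothetical, flattened view of the function symbols in one .o file. */
struct obj {
	const char *file;
	int n_syms;
	char syms[MAX_SYMS][SYM_NAME_LEN];
};

/* Index of name in o, or -1 if absent. */
static int find_sym(const struct obj *o, const char *name)
{
	for (int i = 0; i < o->n_syms; i++)
		if (strcmp(o->syms[i], name) == 0)
			return i;
	return -1;
}

/* Rename a symbol in place, e.g. foo -> unique_foo. */
static int rename_sym(struct obj *o, const char *from, const char *to)
{
	int i = find_sym(o, from);

	if (i < 0 || strlen(to) >= SYM_NAME_LEN || find_sym(o, to) >= 0)
		return -1;
	strcpy(o->syms[i], to);
	return 0;
}

/* Add a second name for an existing symbol, e.g. dummy1 for xdp_prog1. */
static int add_alias(struct obj *o, const char *target, const char *alias)
{
	if (find_sym(o, target) < 0 || find_sym(o, alias) >= 0 ||
	    o->n_syms >= MAX_SYMS || strlen(alias) >= SYM_NAME_LEN)
		return -1;
	strcpy(o->syms[o->n_syms++], alias);
	return 0;
}

/* Number of names defined in both objects; must be 0 before static linking. */
static int count_clashes(const struct obj *a, const struct obj *b)
{
	int n = 0;

	for (int i = 0; i < a->n_syms; i++)
		if (find_sym(b, a->syms[i]) >= 0)
			n++;
	return n;
}
```

Starting from two firewalls that both define foo(), count_clashes() reports one
clash; after the two renames and the two aliases it reports none, and the
dummy1/dummy2 aliases are what the weak definitions in rootlet.o would then
resolve to.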

Static re-linking is more work than dynamic re-linking because it needs to
operate in the single name space of the final .o, whereas dynamic re-linking has
an individual name space for every program loaded into the kernel.
I'm hoping to share a prototype of dynamic re-linking soon.