Re: [RFC] bpf: add bpf_link support for BPF_NETFILTER programs

Toke Høiland-Jørgensen <toke@xxxxxxxxxx> · Mon, 30 Jan 2023 22:10:45 +0100

Florian Westphal <fw@xxxxxxxxx> writes:

> Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>> > Is BPF_LINK the right place?  Hook gets removed automatically if the calling program
>> > exits, afaict this is intended.
>> 
>> Yes, this is indeed intended for bpf_link. This plays well with
>> applications that use the API and stick around (because things get
>> cleaned up after them automatically even if they crash, say), but it
>> doesn't work so well for programs that don't (which, notably, includes
>> command line utilities like 'nft').
>
> Right, but I did not want to create a dependency on nfnetlink or
> nftables netlink right from the start.

Dependency how? For userspace consumers, you mean?

>> For XDP and TC users can choose between bpf_link and netlink for
>> attachment and get one of the two semantics (goes away on close or stays
>> put). Not sure if it would make sense to do the same for nftables?
>
> For nftables I suspect that, if nft can emit bpf, it would make sense to
> pass the bpf descriptor together with nftables netlink, i.e. along with
> the normal netlink data.
>
> nftables kernel side would then know to use the bpf prog for the
> datapath instead of the interpreter and could even fallback to
> interpreter.
>
> But for the raw hook use case that Alexei and Daniel preferred (cf.
> initial proposal to call bpf progs from nf_tables interpreter) I think
> that there should be no nftables dependency.
>
> I could add a new nfnetlink subtype for nf-bpf if bpf_link is not
> appropriate as an alternative.

I don't think there's anything wrong with bpf_link as an initial
interface at least. I just think it should (eventually) be possible to
load a BPF-based firewall from the command line via this interface,
without having to resort to pinning. There was some talk about adding
this as a mode to the bpf_link interface itself at some point, but that
never materialised (probably because the need is not great since the
netlink interface serves this purpose for TC/XDP).

>> > Things like nf_hook_state->in (net_device) could then be exposed via
>> > kfuncs.
>> 
>> Right, so like:
>> 
>> state = bpf_nf_get_hook_state(ctx); ?
>> 
>> Sounds OK to me.
>
> Yes, something like that.  Downside is that the nf_hook_state struct
> is no longer bound by any abi contract, but as I understood it thats
> fine.

Well, there's an ongoing discussion about what, if any, should be the
expectations around kfunc stability:

https://lore.kernel.org/r/20230117212731.442859-1-toke@xxxxxxxxxx

I certainly don't think it's problematic for a subsystem to give *more*
stability guarantees than BPF core. I mean, if you want the kfunc
interface to be stable you just... don't change it? :)

>> > nf_hook_run_bpf() (c-function that creates the program context and
>> > calls the real bpf prog) would be "updated" to use the bpf dispatcher to
>> > avoid the indirect call overhead.
>> 
>> What 'bpf dispatcher' are you referring to here? We have way too many
>> things with that name :P
>
> I meant 'DEFINE_BPF_DISPATCHER(nf_user_progs);'

Ah, right. Yeah, that can definitely be added later!

-Toke