Florian Westphal <fw@xxxxxxxxx> writes: > Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: >> > Is BPF_LINK the right place? Hook gets removed automatically if the calling program >> > exits, afaict this is intended. >> >> Yes, this is indeed intended for bpf_link. This plays well with >> applications that use the API and stick around (because things get >> cleaned up after them automatically even if they crash, say), but it >> doesn't work so well for programs that don't (which, notably, includes >> command line utilities like 'nft'). > > Right, but I did not want to create a dependency on nfnetlink or > nftables netlink right from the start. Dependency how? For userspace consumers, you mean? >> For XDP and TC users can choose between bpf_link and netlink for >> attachment and get one of the two semantics (goes away on close or stays >> put). Not sure if it would make sense to do the same for nftables? > > For nftables I suspect that, if nft can emit bpf, it would make sense to > pass the bpf descriptor together with nftables netlink, i.e. along with > the normal netlink data. > > nftables kernel side would then know to use the bpf prog for the > datapath instead of the interpreter and could even fallback to > interpreter. > > But for the raw hook use case that Alexei and Daniel preferred (cf. > initial proposal to call bpf progs from nf_tables interpreter) I think > that there should be no nftables dependency. > > I could add a new nfnetlink subtype for nf-bpf if bpf_link is not > appropriate as an alternative. I don't think there's anything wrong with bpf_link as an initial interface at least. I just think it should (eventually) be possible to load a BPF-based firewall from the command line via this interface, without having to resort to pinning. There was some talk about adding this as a mode to the bpf_link interface itself at some point, but that never materialised (probably because the need is not great since the netlink interface serves this purpose for TC/XDP). >> > Things like nf_hook_state->in (net_device) could then be exposed via >> > kfuncs. >> >> Right, so like: >> >> state = bpf_nf_get_hook_state(ctx); ? >> >> Sounds OK to me. > > Yes, something like that. Downside is that the nf_hook_state struct > is no longer bound by any abi contract, but as I understood it thats > fine. Well, there's an ongoing discussion about what, if any, should be the expectations around kfunc stability: https://lore.kernel.org/r/20230117212731.442859-1-toke@xxxxxxxxxx I certainly don't think it's problematic for a subsystem to give *more* stability guarantees than BPF core. I mean, if you want the kfunc interface to be stable you just... don't change it? :) >> > nf_hook_run_bpf() (c-function that creates the program context and >> > calls the real bpf prog) would be "updated" to use the bpf dispatcher to >> > avoid the indirect call overhead. >> >> What 'bpf dispatcher' are you referring to here? We have way too many >> things with that name :P > > I meant 'DEFINE_BPF_DISPATCHER(nf_user_progs);' Ah, right. Yeah, that can definitely be added later! -Toke