On Sun, Nov 22, 2020 at 12:01:45PM +0100, Pablo Neira Ayuso wrote:
> Hi Alexei,
>
> On Sat, Nov 21, 2020 at 07:24:24PM -0800, Alexei Starovoitov wrote:
> > On Sat, Nov 21, 2020 at 10:59 AM Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > >
> > > We're lately discussing more and more usecases in the NFWS meetings
> > > where the egress can get really useful.
> >
> > We also discussed in the meeting XYZ that this hook is completely pointless.
> > Got the hint?
>
> No need to use irony.
>
> OK, so at this point it's basically a bunch of BPF core developers
> that is pushing back on these egress support series.
>
> The BPF project is moving on and making progress. Why don't you just
> keep convincing more users to adopt your solution? You can just
> provide incentives for them to adopt your software, make more
> benchmarks, more documentation and so on. That's all perfectly fine
> and you are making a great job on that field.
>
> But why you do not just let us move ahead?
>
> If you, the BPF team and your users, do not want to use Netfilter,
> that's perfectly fine. Why don't you let users choose what subsystem
> of choice that they like for packet filtering?
>
> I already made my own mistakes in the past when I pushed back for BPF
> work, that was wrong. It's time to make peace and take this to an end.

Please consider using bpf egress for what you want to accomplish.

k8s networking is a great goal. It's challenging, since it demands
more from the kernel than the existing set of hardcoded features provide.
Clearly you cannot solve it with in-kernel iptables/nft and have to use
out-of-tree kernel modules that plug into netfilter hooks.

The kernel community always had and always will have a basic rule that
the kernel does not add APIs for out-of-tree projects.
That's why the kernel is so successful. The developers have to come back
to the kernel community.

nft egress hook is trying to cheat its way in by arguing its usefulness
for some hypothetical case.
If it was not driven by an out-of-tree kernel module I wouldn't have any
problem with it.
nft egress is not a normal path of kernel development.
It's a missing hook for an out-of-tree module. That's why it stinks so much.

So please consider augmenting your nft k8s solution with a tiny bit of bpf.
bpf can add a new helper to call into nf_hook_slow().
The helper would be equivalent to the
"int nf_hook_ingress(struct sk_buff *skb)" function.
With a tiny bpf prog you'll be able to delegate skb processing to nft
everywhere where bpf sees an skb.
That's a lot more places than tc-egress.
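
To make the delegation idea concrete, here is a rough userspace model in plain C. It is only a sketch and not kernel code: no such bpf helper exists in this thread, and every name below (struct pkt, struct hook_chain, chain_run, helper_nf_hook, tiny_prog, drop_telnet) is hypothetical. chain_run() stands in for what nf_hook_slow() does at a very high level, helper_nf_hook() stands in for the proposed helper, and the return values are simplified assumptions rather than the kernel's actual conventions.

```c
#include <stddef.h>

/* Simplified verdicts, loosely modelled on netfilter's accept/drop. */
enum verdict { V_DROP, V_ACCEPT };

/* Hypothetical stand-in for struct sk_buff: only the fields the demo needs. */
struct pkt {
    unsigned short dport;
    unsigned int len;
};

typedef enum verdict (*hook_fn)(const struct pkt *p);

/* Hypothetical stand-in for the list of hooks registered by nft. */
struct hook_chain {
    hook_fn hooks[8];
    size_t n;
};

/* Model of walking the hook chain: the first drop verdict ends the walk. */
static enum verdict chain_run(const struct hook_chain *c, const struct pkt *p)
{
    for (size_t i = 0; i < c->n; i++) {
        if (c->hooks[i](p) == V_DROP)
            return V_DROP;
    }
    return V_ACCEPT;
}

/*
 * Model of the proposed helper: hand the packet to the hook chain.
 * Assumed convention for this sketch: 0 = keep processing, -1 = dropped.
 */
static int helper_nf_hook(const struct hook_chain *c, const struct pkt *p)
{
    return chain_run(c, p) == V_ACCEPT ? 0 : -1;
}

/* Example rule: drop anything addressed to the telnet port. */
static enum verdict drop_telnet(const struct pkt *p)
{
    return p->dport == 23 ? V_DROP : V_ACCEPT;
}

/* Model of the "tiny bpf prog": it does nothing except delegate. */
static int tiny_prog(const struct hook_chain *c, const struct pkt *p)
{
    return helper_nf_hook(c, p);
}
```

In this model, a chain holding only drop_telnet makes tiny_prog() return -1 for a packet to port 23 and 0 for a packet to port 443. The point of the sketch is that the program itself carries no policy; the policy stays in the chain, and the program could be attached wherever a packet is visible.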