> I see several possible areas of contention:
>
> 1) If you aim for a non-feature-complete support of iptables rules, it
> will create confusion to the users.

Right, you need full feature parity to avoid ending up having to
maintain two implementations.

It seems uncontroversial that BPF can be very powerful if run at
iptables hooks. For performance, but also versatility. The Android
folks are converting one out-of-tree module to BPF. There is probably
a lot more such business logic out there that is not suitable for
inclusion in mainline as an xt match/target, and that needs more
access than xt_bpf can provide.

If a new first-class citizen BPF infra can do this and back the
legacy interface, too, that would save on maintenance. There is a
steady stream of fixes to iptables, e.g., from syzkaller vulnerability
reports. Just keeping the old implementation around as a dead letter
is not a safe deprecation strategy.

To bootstrap bpfilter, in the short term a reasonable set of iptables
targets and matches can perhaps be ported to BPF external functions
with some simple glue code.

> To me, this looks like some kind of legacy backwards compatibility
> mechanism that one would find in proprietary operating systems, but not
> in Linux. iptables, libiptc etc. are all free software. The source
> code can be edited, and you could just as well have a new version of
> iptables and/or libiptc which would pass the ruleset in userspace to
> your compiler, which would then insert the resulting eBPF program.
>
> Why add quite comprehensive kernel infrastructure? What's the motivation
> here?

The ABI deprecation point has been discussed quite a bit. If it is
infeasible to just drop the old interface, then an upcall mechanism
does seem the most practical approach to dynamically generating this
code. FWIW, as BPF is being used in more places, other locations
besides iptables could make use of this.

> Could you please clarify why the 'filter' table INPUT chain was used if
> you're using XDP? AFAICT they have completely different semantics.
>
> There is a well-conceived and generally understood notion of where
> exactly the filter/INPUT table processing happens. And that's not as
> early as in the NIC, but it's much later in the processing of the
> packet.
>
> I believe _if_ one wants to use the approach of "hiding" eBPF behind
> iptables, then either
>
> a) the eBPF programs must be executed at the exact same points in the
> stack as the existing hooks of the built-in chains of the
> filter/nat/mangle/raw tables, or
>
> b) you must introduce new 'tables', like an 'xdp' table which then has
> the notion of processing very early in processing, way before the
> normal filter table INPUT processing happens.

Agreed. One of the larger issues in the Android qtaguid conversion
was the state surrounding the skb at the time of processing. This
example primarily depended on having skb->sk set. Whether that is
available at tc depends on early decap, and even when set, the sk
might prove different from the final one in the socket layer in edge
cases. Just one example of how moving the call site can be very
fragile wrt state.

Another issue wrt moving around is the availability of external
functions at different layers. XDP has access to far fewer than TC.
For iptables, I would imagine that you either want parity with TC or
even a new independent type. Parity would also be useful to expose
some xt_match functionality at the TC layer that is currently missing
there.

> My main points are:
>
> 1) What is the goal of this?

My high-order-bit feedback: for cases like qtaguid, it is very useful
to be able to execute BPF as a drop-in at existing iptables locations,
as is having various match and target functionality available from
BPF. Maintaining the legacy ABI is basically dictated.
If this can be achieved while optimizing the runtime path and reducing maintenance that is very appealing.