Hi Florian,

On 02/16/2018 05:14 PM, Florian Westphal wrote:
> Florian Westphal <fw@xxxxxxxxx> wrote:
>> Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>> Several questions spinning at the moment, I will probably come up with
>> more:
>
> ... and here there are some more ...
>
> One of the many pain points of xtables design is the assumption of 'used
> only by sysadmin'.
>
> This has not been true for a very long time, so by now iptables has
> this userspace lock (yes, its fugly workaround) to serialize concurrent
> iptables invocations in userspace.
>
> AFAIU the translate-in-userspace design now brings back the old problem
> of different tools overwriting each others iptables rules.

Right, so the behavior would need to be adapted to be exactly the same.
Given that all the requests go into kernel space first via the usual
uapis, I don't think there would be anything in the way of keeping that
as is.

> Another question -- am i correct in that each rule manipulation would
> incur a 'recompilation'? Or are there different mini programs chained
> together?

Right now in the PoC, yes: it basically regenerates the program on the
fly in gen.c when walking the struct bpfilter_ipt_ip's and appends the
entries to the program, but it doesn't have to be that way. There are
multiple options to allow for partial code generation, e.g. via chaining
tail call arrays or, eventually, directly via BPF to BPF calls. A few
changes would be needed on the BPF side, but it can be done. Additionally,
various optimization passes could be performed during the code generation
phase, while keeping the given constraints, in order to speed up getting
to a verdict.

> One of the nftables advantages is that (since rule representation in
> kernel is black-box from userspace point of view) is that the kernel
> can announce add/delete of rules or elements from nftables sets.
>
> Any particular reason why translating iptables rather than nftables
> (it should be possible to monitor the nftables changes that are
> announced by kernel and act on those)?

Yeah, correct, this should be possible as well. We started out with the
iptables part in the demo as the majority of bigger infrastructure
projects still rely heavily on it (e.g. docker and k8s, to name just two
big ones). Usually they have their requests to iptables baked directly
into their code, which probably won't change any time soon, so the
thought was that they could benefit from it initially once there is
sufficient coverage.

Thanks,
Daniel
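The difference between regenerating the whole program on each rule change and chaining per-rule mini programs, as discussed above, can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not use real BPF, the "programs" are plain Python lists of strings, and the names gen_rule, FullRegen and ChainedSlots are hypothetical and do not come from the bpfilter PoC. The slot table is only loosely modelled on a tail call array.

```python
# Toy model only: "programs" are lists of strings, not real BPF bytecode.
# All names here (gen_rule, FullRegen, ChainedSlots) are hypothetical.

def gen_rule(rule):
    """Pretend code generation for one rule; returns its 'instructions'."""
    return [f"match {rule['match']}", f"verdict {rule['verdict']}"]


class FullRegen:
    """Regenerate the whole program on every rule change."""

    def __init__(self):
        self.rules = []
        self.program = []
        self.rules_compiled = 0  # total number of per-rule generations run

    def append(self, rule):
        self.rules.append(rule)
        self.program = []
        for r in self.rules:  # walk every rule again, old and new
            self.program.extend(gen_rule(r))
            self.rules_compiled += 1


class ChainedSlots:
    """Keep one mini program per rule in a slot table; appending a rule
    only generates code for the new slot."""

    def __init__(self):
        self.slots = []
        self.rules_compiled = 0

    def append(self, rule):
        self.slots.append(gen_rule(rule))
        self.rules_compiled += 1

    @property
    def program(self):
        # Flattened view of the chain, in evaluation order.
        return [insn for slot in self.slots for insn in slot]


rules = [
    {"match": "tcp dport 22", "verdict": "ACCEPT"},
    {"match": "tcp dport 80", "verdict": "ACCEPT"},
    {"match": "any", "verdict": "DROP"},
]

full, chained = FullRegen(), ChainedSlots()
for rule in rules:
    full.append(rule)
    chained.append(rule)

# Same resulting rule sequence, different amount of generation work.
print(full.rules_compiled, chained.rules_compiled)  # prints: 6 3
```

In this model both strategies end up with the same instruction sequence, but full regeneration repeats the work for every existing rule on each append (1 + 2 + 3 = 6 generations for three rules), while the chained variant generates each rule once. The sketch says nothing about lookup cost at packet time or about the BPF-side changes mentioned above.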