Hello Arturo,

Thanks a lot for your reply. My ultimate goal is to develop a kube-proxy which builds nftables rules instead of iptables rules. In addition, the goal is to use direct API calls to netlink without any external dependencies and, of course, to try to leverage nftables' advanced features to achieve the best performance.

I am in the process of identifying gaps in the functionality available in the github.com/google/nftables and github.com/sbezverk/nftableslib libraries. For example, yesterday I found out that neither of these libraries supports "numgen", which is a mandatory feature for load balancing between a service's multiple endpoints. I will have to add it to both to be able to move forward.

I use the iptables rules from a working cluster and try to build code which programs nftables the same way (with optimization). Once that is done, it can be arranged into a controller listening for svc/endpoints and programming nftables accordingly.

I am looking for people interested in the same topic to discuss different approaches, as was done yesterday with Phil, and to select the best approach to make nftables shine. Please let me know if you are interested in further discussions.

Thank you
Serguei

On 2019-11-27, 5:12 AM, "Arturo Borrero Gonzalez" <arturo@xxxxxxxxxxxxx> wrote:

On 11/26/19 10:20 PM, Serguei Bezverkhi (sbezverk) wrote:
> On Tue, Nov 26, 2019 at 06:47:09PM +0000, Serguei Bezverkhi (sbezverk) wrote:
> > Ok, I guess I will work around by using input and output chain types, even though it will raise some brows in k8s networking community.
> >

@Sergei, thanks for reaching out about this topic. I'm using k8s a lot lately and would be interested in knowing more about what you are trying to do with kubernetes and nftables.

In any case, if somebody in kubernetes is planning to introduce nft for kube-proxy or other component, I would suggest the generated ruleset is validated here to really benefit from nftables.
Is this what you are doing?

Recently I had the chance to attend a talk by @Laura (in CC) about the iptables ruleset generated by docker and kube-proxy. Such rulesets are the opposite of something meant to scale and perform well. Then people compare such rulesets with other networking setups... an unfair comparison.

Worth mentioning at this point this PoC too:

https://github.com/zevenet/kube-nftlb

Trying to mimic 1:1 what iptables was doing is a mistake from my point of view. I believe you are aware of this already :-)

>
> Keeping both target address and port in a single map for *NAT statements
> is not possible AFAIK.

@Phil, I think it is possible! Examples in the wiki:

https://wiki.nftables.org/wiki-nftables/index.php/Multiple_NATs_using_nftables_maps

It would be something like:

% nft add rule nat prerouting dnat \
      tcp dport map { 1000 : 1.1.1.1, 2000 : 2.2.2.2, 3000 : 3.3.3.3} \
      : tcp dport map { 1000 : 1234, 2000 : 2345, 3000 : 3456 }

>
> If I'm not mistaken, you might be able to hook up a vmap together with
> the numgen expression above like so:
>
> | numgen random mod 0x2 vmap { \
> |     0x0: jump KUBE-SEP-FS3FUULGZPVD4VYB, \
> |     0x1: jump KUBE-SEP-MMFZROQSLQ3DKOQA }
>
> Pure speculation, though. :)
>

This works indeed. Just added the example to the wiki:

https://wiki.nftables.org/wiki-nftables/index.php/Load_balancing#Round_Robin
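
For illustration, the numgen/vmap statement quoted above could be generated from a service's list of per-endpoint chains. This is only a minimal sketch: numgenVmapRule is a hypothetical helper, not part of github.com/google/nftables or github.com/sbezverk/nftableslib, and it only renders the statement text in the form quoted above. It does not talk to netlink.

```go
package main

import (
	"fmt"
	"strings"
)

// numgenVmapRule is a hypothetical helper (an assumption for illustration,
// not an API of any nftables library). It renders the numgen/vmap statement
// quoted in this thread for the given endpoint chain names: entry i of the
// verdict map jumps to chains[i], and the modulus equals len(chains).
func numgenVmapRule(chains []string) (string, error) {
	if len(chains) == 0 {
		return "", fmt.Errorf("at least one endpoint chain is required")
	}
	entries := make([]string, 0, len(chains))
	for i, chain := range chains {
		entries = append(entries, fmt.Sprintf("0x%x: jump %s", i, chain))
	}
	return fmt.Sprintf("numgen random mod 0x%x vmap { %s }",
		len(chains), strings.Join(entries, ", ")), nil
}

func main() {
	// Chain names taken from the example quoted in this thread.
	rule, err := numgenVmapRule([]string{
		"KUBE-SEP-FS3FUULGZPVD4VYB",
		"KUBE-SEP-MMFZROQSLQ3DKOQA",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(rule)
}
```

A controller watching svc/endpoints could rebuild this statement whenever the endpoint list changes, so that the modulus always matches the number of map entries.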