Hello.

I have been working on an nftables backend as an alternative to (or
replacement for?) the libiptc one.

A PoC is here:
https://git.breakpoint.cc/cgit/fw/systemd.git/log/?h=nft_08

Diffstat:

 basic/linux/netfilter/nf_tables.h        | 1801 +++++++++++++++++++++++++++++++
 basic/linux/netfilter/nfnetlink.h        |   81 +
 libsystemd/meson.build                   |    1
 libsystemd/sd-netlink/netlink-internal.h |    1
 libsystemd/sd-netlink/netlink-socket.c   |   26
 libsystemd/sd-netlink/netlink-types.c    |  234 ++++
 libsystemd/sd-netlink/netlink-types.h    |   16
 libsystemd/sd-netlink/nfnl-message.c     |  309 +++++
 libsystemd/sd-netlink/sd-netlink.c       |   25
 network/networkd-address.c               |    4
 nspawn/nspawn-expose-ports.c             |    6
 shared/firewall-util-nft.c               |  746 ++++++++++++
 shared/firewall-util.c                   |   23
 shared/firewall-util.h                   |   22
 shared/meson.build                       |    2
 systemd/sd-netlink.h                     |   25
 test/test-firewall-util.c                |   24
 17 files changed, 3296 insertions(+), 50 deletions(-)

Most of this comes from the import of nf_tables.h (cached header of
kernel uapi) and the nfnetlink backend, i.e. this doesn't add an
external library dependency.

At this time, the prototype disables the existing libiptc backend and
unconditionally uses the nft one. I did this for simplicity.

This also means that the existing API (fw_add_...) is mostly the same.
I say *mostly* because that API exposes more functionality (on the
iptables side) than is actually used, such as in/output interface
names, where all callers pass NULL.

To simplify the prototype I modified the API to drop the 'always NULL'
arguments and focus on what is actually used.

The idea is to create a static ruleset, added once when the first rule
is added, or by a new 'init NAT facility' function.

The prototype is complete enough to run test-firewall-util.

The following ruleset will be created:

table ip io.systemd.nat {
        set masq_saddr {
                type ipv4_addr
        }
        map map_port_ipport {
                type inet_proto . inet_service : ipv4_addr . inet_service
        }
        chain prerouting {
                type nat hook prerouting priority filter + 1; policy accept;
                fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
        }
        chain postrouting {
                type nat hook postrouting priority filter + 1; policy accept;
                ip saddr @masq_saddr masquerade
        }
}

After that, future fw_add_masquerade/add_local_dnat calls will only
add/delete the element/mapping to masq_saddr and map_port_ipport,
respectively. The ruleset itself never changes.

Running test-firewall-util with this backend gives the following
output on a parallel 'nft monitor':

$ nft monitor
add table ip io.systemd.nat
add chain ip io.systemd.nat prerouting { type nat hook prerouting priority filter + 1; policy accept; }
add chain ip io.systemd.nat postrouting { type nat hook postrouting priority filter + 1; policy accept; }
add set ip io.systemd.nat masq_saddr { type ipv4_addr; }
add map ip io.systemd.nat map_port_ipport { type inet_proto . inet_service : ipv4_addr . inet_service; }
add rule ip io.systemd.nat prerouting fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
add rule ip io.systemd.nat postrouting ip saddr @masq_saddr masquerade
add element ip io.systemd.nat masq_saddr { 10.1.2.0 }
add element ip io.systemd.nat masq_saddr { 10.1.2.3 }
delete element ip io.systemd.nat masq_saddr { 10.1.2.0 }
add element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.4 . 815 }
delete element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.4 . 815 }
add element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.5 . 815 }
delete element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.5 . 815 }
CTRL-C

So, good enough for a prototype and to send it out to get feedback.

It's still incomplete:

1. No output chain is added. This is needed to complete local dnat
   support (the sd_nfnl_message_new_dnat_rule_out function doesn't
   work yet).
2. No ipv6 support, but this is rather easy; the current nfnetlink
   backend should be complete enough for this.
3. No cleanup on restart, i.e. on startup the table should be deleted
   if it exists, rather than re-adding the pre/postrouting rules.
4. 'set masq_saddr' should use ranges, so we can do masquerade for
   e.g. 10.2.3.4-10.2.3.4 or 10.2.3.0/24 instead of only 10.2.3.4/32.
   This should not be too hard to add.
5. This currently replaces the libiptc backend. Alternatives are a
   compile-time or run-time switch.
6. No monitoring support. Theoretically libsystemd could subscribe to
   the nftables netlink notification interface to e.g. learn when a
   user has flushed a set, removed a rule, etc. I'm currently not sure
   this is needed, due to the usual 'and what do we do now' problem.

Would this be deemed acceptable for merging into systemd once the
first four points are fixed/implemented?

As for retaining the libiptc backend -- I would propose to hold off on
deciding that. I would first test this on stock 4.14-ish kernels to
see what works and what is problematic.

If you want to retain the libiptc backend in any case: do you have
suggestions on how to toggle this? Would a configure switch be enough?

Thanks,
Florian
_______________________________________________
systemd-devel mailing list
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
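
A note on item 4 of the list above: storing a prefix such as
10.2.3.0/24 as a range means expanding it into a first/last address
pair. The following is only a minimal sketch of that arithmetic, not
code from the PoC. The names ipv4_prefix_to_range and struct
ipv4_range are hypothetical, and addresses are assumed to be in host
byte order (the netlink attributes would need network byte order).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the PoC: an inclusive IPv4
 * address range, both ends in host byte order. */
struct ipv4_range {
        uint32_t first;
        uint32_t last;
};

/* Expand addr/prefixlen into the inclusive range of addresses it covers. */
struct ipv4_range ipv4_prefix_to_range(uint32_t addr, unsigned prefixlen) {
        struct ipv4_range r;
        uint32_t mask;

        assert(prefixlen <= 32);

        /* Shifting a 32-bit value by 32 bits is undefined behaviour in C,
         * so a prefix length of 0 is handled separately. */
        mask = prefixlen == 0 ? 0 : (uint32_t) (UINT32_MAX << (32 - prefixlen));

        r.first = addr & mask;          /* clear the host bits */
        r.last = r.first | ~mask;       /* set the host bits */
        return r;
}
```

With this, 10.2.3.0/24 becomes 10.2.3.0-10.2.3.255, and a /32 collapses
to a range whose first and last address are the same, so single
addresses would not need a separate code path.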