On Fri, Jul 20, 2018 at 03:28:31PM +0200, Pablo Neira Ayuso wrote: > Hi Mate, > > A few comestic on the _init path, and one concern of probably missing > sanity check, also from the _init path see below. > > On Fri, Jul 20, 2018 at 09:34:14AM +0200, Máté Eckl wrote: > > A great portion of the code is taken from xt_TPROXY.c > > > > There are some changes compared to the iptables implementation: > > - tproxy statement is not terminal here > > - Either address or port has to be specified, but at least one of them > > is necessary. If one of them is not specified, the evaluation will be > > performed with the original attribute of the packet (ie. target port > > is not specified => the packet's dport will be used). > > > > To make this work in inet tables, the tproxy structure has a family > > member (typically called priv->family) which is not necessarily equal to > > ctx->family. > > > > priv->family can have three values legally: > > - NFPROTO_IPV4 if the table family is ip OR if table family is inet, > > but an ipv4 address is specified as a target address. The rule only > > evaluates ipv4 packets in this case. > > - NFPROTO_IPV6 if the table family is ip6 OR if table family is inet, > > but an ipv6 address is specified as a target address. The rule only > > evaluates ipv6 packets in this case. > > - NFPROTO_UNSPEC if the table family is inet AND if only the port is > > specified. The rule will evaluate both ipv4 and ipv6 packets. > > > > Signed-off-by: Máté Eckl <ecklm94@xxxxxxxxx> > > --- > > v2: > > - address or port is now compulsory > > - nf_defrag_ipv{4,6}_enable called in init > > - nft_tproxy now selects NF_DEFRAG_IPV4/6 > > - Add transport header presence test in ipv4 eval (in ipv6 it was > > already present) > > - Add check for the case when address is specified but the rule family > > is not set accordingly > > > > v3: > > - Fix tproxy and context family compatibility test in init > > > > v4: > > - Fix module test macros. Use NF_TABLES_IPV6 as dependency for > > compiling ipv6 related functions. > > - Apply cosmetic changes what Pablo suggested. > > - Use NFPROTO_UNSPEC when only port is specified in inet tables. > > - Conform 5711b4e89319 ("netfilter: nf_tproxy: fix possible non-linear access to transport header") in nf tree > > - More detailed commit message. > > > > include/uapi/linux/netfilter/nf_tables.h | 16 ++ > > net/netfilter/Kconfig | 10 + > > net/netfilter/Makefile | 1 + > > net/netfilter/nft_tproxy.c | 320 +++++++++++++++++++++++ > > 4 files changed, 347 insertions(+) > > create mode 100644 net/netfilter/nft_tproxy.c > > > > diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h > > index 89438e68dc03..e58d2b70dde7 100644 > > --- a/include/uapi/linux/netfilter/nf_tables.h > > +++ b/include/uapi/linux/netfilter/nf_tables.h > > @@ -1250,6 +1250,22 @@ enum nft_nat_attributes { > > }; > > #define NFTA_NAT_MAX (__NFTA_NAT_MAX - 1) > > > > +/** > > + * enum nft_tproxy_attributes - nf_tables tproxy expression netlink attributes > > + * > > + * NFTA_TPROXY_FAMILY: Target address family (NLA_U32: nft_registers) > > + * NFTA_TPROXY_REG_ADDR: Target address register (NLA_U32: nft_registers) > > + * NFTA_TPROXY_REG_PORT: Target port register (NLA_U32: nft_registers) > > + */ > > +enum nft_tproxy_attributes { > > + NFTA_TPROXY_UNSPEC, > > + NFTA_TPROXY_FAMILY, > > + NFTA_TPROXY_REG_ADDR, > > + NFTA_TPROXY_REG_PORT, > > + __NFTA_TPROXY_MAX > > +}; > > +#define NFTA_TPROXY_MAX (__NFTA_TPROXY_MAX - 1) > > + > > /** > > * enum nft_masq_attributes - nf_tables masquerade expression attributes > > * > > diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig > > index 6c65d756e603..fc570c22a860 100644 > > --- a/net/netfilter/Kconfig > > +++ b/net/netfilter/Kconfig > > @@ -633,6 +633,16 @@ config NFT_SOCKET > > This option allows matching for the presence or absence of a > > corresponding socket and its attributes. > > > > +config NFT_TPROXY > > + tristate "Netfilter nf_tables tproxy support" > > + depends on IPV6 || IPV6=n > > + select NF_DEFRAG_IPV4 > > + select NF_DEFRAG_IPV6 if NF_TABLES_IPV6 > > + select NF_TPROXY_IPV4 > > + select NF_TPROXY_IPV6 if NF_TABLES_IPV6 > > + help > > + This makes transparent proxy support available in nftables. > > + > > if NF_TABLES_NETDEV > > > > config NF_DUP_NETDEV > > diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile > > index 0b3851e825fa..ea35f206e49d 100644 > > --- a/net/netfilter/Makefile > > +++ b/net/netfilter/Makefile > > @@ -109,6 +109,7 @@ obj-$(CONFIG_NFT_FIB_INET) += nft_fib_inet.o > > obj-$(CONFIG_NFT_FIB_NETDEV) += nft_fib_netdev.o > > obj-$(CONFIG_NF_OSF) += nf_osf.o > > obj-$(CONFIG_NFT_SOCKET) += nft_socket.o > > +obj-$(CONFIG_NFT_TPROXY) += nft_tproxy.o > > > > # nf_tables netdev > > obj-$(CONFIG_NFT_DUP_NETDEV) += nft_dup_netdev.o > > diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c > > new file mode 100644 > > index 000000000000..a9290035c651 > > --- /dev/null > > +++ b/net/netfilter/nft_tproxy.c > > @@ -0,0 +1,320 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#include <linux/module.h> > > +#include <linux/netfilter/nf_tables.h> > > +#include <net/netfilter/nf_tables.h> > > +#include <net/netfilter/nf_tables_core.h> > > +#include <net/netfilter/nf_tproxy.h> > > +#include <net/inet_sock.h> > > +#include <net/tcp.h> > > +#include <linux/if_ether.h> > > +#include <net/netfilter/ipv4/nf_defrag_ipv4.h> > > +#if IS_ENABLED(CONFIG_NF_TABLES_IPV6) > > +#include <net/netfilter/ipv6/nf_defrag_ipv6.h> > > +#endif > > + > > +struct nft_tproxy { > > + enum nft_registers sreg_addr:8; > > + enum nft_registers sreg_port:8; > > + u8 family; > > +}; > > + > > +static void nft_tproxy_eval_v4(const struct nft_expr *expr, > > + struct nft_regs *regs, > > + const struct nft_pktinfo *pkt) > > +{ > > + const struct nft_tproxy *priv = nft_expr_priv(expr); > > + struct sk_buff *skb = pkt->skb; > > + struct sock *sk; > > + const struct iphdr *iph = ip_hdr(skb); > > + struct udphdr _hdr, *hp; > > + __be32 taddr = 0; > > + __be16 tport = 0; > > + > > + hp = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_hdr), &_hdr); > > + if (!hp) { > > + regs->verdict.code = NFT_BREAK; > > + return; > > + } > > + > > + /* check if there's an ongoing connection on the packet addresses, this > > + * happens if the redirect already happened and the current packet > > + * belongs to an already established connection > > + */ > > + sk = nf_tproxy_get_sock_v4(nft_net(pkt), skb, iph->protocol, > > + iph->saddr, iph->daddr, > > + hp->source, hp->dest, > > + skb->dev, NF_TPROXY_LOOKUP_ESTABLISHED); > > + > > + if (priv->sreg_addr) > > + taddr = regs->data[priv->sreg_addr]; > > + taddr = nf_tproxy_laddr4(skb, taddr, iph->daddr); > > + > > + if (priv->sreg_port) > > + tport = regs->data[priv->sreg_port]; > > + if (!tport) > > + tport = hp->dest; > > + > > + /* UDP has no TCP_TIME_WAIT state, so we never enter here */ > > + if (sk && sk->sk_state == TCP_TIME_WAIT) { > > + /* reopening a TIME_WAIT connection needs special handling */ > > + sk = nf_tproxy_handle_time_wait4(nft_net(pkt), skb, taddr, tport, sk); > > + } > > + else if (!sk) { > > + /* no, there's no established connection, check if > > + * there's a listener on the redirected addr/port > > + */ > > + sk = nf_tproxy_get_sock_v4(nft_net(pkt), skb, iph->protocol, > > + iph->saddr, taddr, > > + hp->source, tport, > > + skb->dev, NF_TPROXY_LOOKUP_LISTENER); > > + } > > + > > + if (sk && nf_tproxy_sk_is_transparent(sk)) > > + nf_tproxy_assign_sock(skb, sk); > > + else > > + regs->verdict.code = NFT_BREAK; > > +} > > + > > +#if IS_ENABLED(CONFIG_NF_TABLES_IPV6) > > +static void nft_tproxy_eval_v6(const struct nft_expr *expr, > > + struct nft_regs *regs, > > + const struct nft_pktinfo *pkt) > > +{ > > + const struct nft_tproxy *priv = nft_expr_priv(expr); > > + struct sk_buff *skb = pkt->skb; > > + struct sock *sk; > > + const struct ipv6hdr *iph = ipv6_hdr(skb); > > + struct udphdr _hdr, *hp; > > + struct in6_addr taddr = {0}; > > + __be16 tport = 0; > > + int thoff = pkt->xt.thoff; > > + int l4proto; > > + > > + if (!pkt->tprot_set) { > > + regs->verdict.code = NFT_BREAK; > > + return; > > + } > > + l4proto = pkt->tprot; > > + > > + hp = skb_header_pointer(skb, thoff, sizeof(_hdr), &_hdr); > > + if (hp == NULL) { > > + regs->verdict.code = NFT_BREAK; > > + return; > > + } > > + > > + /* check if there's an ongoing connection on the packet addresses, this > > + * happens if the redirect already happened and the current packet > > + * belongs to an already established connection > > + */ > > + sk = nf_tproxy_get_sock_v6(nft_net(pkt), skb, thoff, l4proto, > > + &iph->saddr, &iph->daddr, > > + hp->source, hp->dest, > > + nft_in(pkt), NF_TPROXY_LOOKUP_ESTABLISHED); > > + > > + if (priv->sreg_addr) > > + memcpy(&taddr, ®s->data[priv->sreg_addr], sizeof(taddr)); > > + taddr = *nf_tproxy_laddr6(skb, &taddr, &iph->daddr); > > + > > + if (priv->sreg_port) > > + tport = regs->data[priv->sreg_port]; > > + if (!tport) > > + tport = hp->dest; > > + > > + /* UDP has no TCP_TIME_WAIT state, so we never enter here */ > > + if (sk && sk->sk_state == TCP_TIME_WAIT) { > > + /* reopening a TIME_WAIT connection needs special handling */ > > + sk = nf_tproxy_handle_time_wait6(skb, l4proto, thoff, > > + nft_net(pkt), > > + &taddr, > > + tport, > > + sk); > > + } > > + else if (!sk) { > > + /* no there's no established connection, check if > > + * there's a listener on the redirected addr/port > > + */ > > + sk = nf_tproxy_get_sock_v6(nft_net(pkt), skb, thoff, > > + l4proto, &iph->saddr, &taddr, > > + hp->source, tport, > > + nft_in(pkt), NF_TPROXY_LOOKUP_LISTENER); > > + } > > + > > + /* NOTE: assign_sock consumes our sk reference */ > > + if (sk && nf_tproxy_sk_is_transparent(sk)) > > + nf_tproxy_assign_sock(skb, sk); > > + else > > + regs->verdict.code = NFT_BREAK; > > +} > > +#endif > > + > > +static void nft_tproxy_eval(const struct nft_expr *expr, > > + struct nft_regs *regs, > > + const struct nft_pktinfo *pkt) > > +{ > > + const struct nft_tproxy *priv = nft_expr_priv(expr); > > + > > + switch (nft_pf(pkt)) { > > + case NFPROTO_IPV4: > > + switch (priv->family) { > > + case NFPROTO_IPV4: > > + case NFPROTO_UNSPEC: > > + nft_tproxy_eval_v4(expr, regs, pkt); > > + return; > > + } > > + break; > > +#if IS_ENABLED(CONFIG_NF_TABLES_IPV6) > > + case NFPROTO_IPV6: > > + switch (priv->family) { > > + case NFPROTO_IPV6: > > + case NFPROTO_UNSPEC: > > + nft_tproxy_eval_v6(expr, regs, pkt); > > + return; > > + } > > +#endif > > + } > > + regs->verdict.code = NFT_BREAK; > > +} > > + > > +static const struct nla_policy nft_tproxy_policy[NFTA_TPROXY_MAX + 1] = { > > + [NFTA_TPROXY_FAMILY] = { .type = NLA_U32 }, > > + [NFTA_TPROXY_REG_ADDR] = { .type = NLA_U32 }, > > + [NFTA_TPROXY_REG_PORT] = { .type = NLA_U32 }, > > +}; > > + > > +static int nft_tproxy_init(const struct nft_ctx *ctx, > > + const struct nft_expr *expr, > > + const struct nlattr * const tb[]) > > +{ > > + struct nft_tproxy *priv = nft_expr_priv(expr); > > + unsigned int alen = 0; > > + int err; > > Probably check here: > > if (!tb[NFTA_TPROXY_FAMILY]) > return -EINVAL; > > family = ...; > > So we can reuse the switch() below... > > > + > > + switch (ctx->family) { > > + case NFPROTO_IPV4: > > +#if IS_ENABLED(CONFIG_NF_TABLES_IPV6) > > + case NFPROTO_IPV6: > > +#endif > > + case NFPROTO_INET: > > I think you have to update this to NFPROTO_UNSPEC. No because this is the ctx->family, not the priv->family. This has to be done so that a tproxy statement cannot be added to a netdev (or arp, etc.) table. > > > + break; > > + default: > > + return -EOPNOTSUPP; > > + } > > + > > + if (!tb[NFTA_TPROXY_FAMILY] || > > + (!tb[NFTA_TPROXY_REG_ADDR] && !tb[NFTA_TPROXY_REG_PORT])) > > + return -EINVAL; > > + > > + priv->family = ntohl(nla_get_be32(tb[NFTA_TPROXY_FAMILY])); > > + switch (ctx->family) { > > To do what we're doing this in this switch() ... > > > + case NFPROTO_IPV4: > > + if (priv->family != NFPROTO_IPV4) > > + return -EINVAL; > > + break; > > + case NFPROTO_IPV6: > > + if (priv->family != NFPROTO_IPV6) > > + return -EINVAL; > > + break; > > + } > > + > > + /* Address is specified but the rule family is not set accordingly */ > > + if (priv->family == NFPROTO_UNSPEC && tb[NFTA_TPROXY_REG_ADDR]) > > + return -EINVAL; > > With the change I'm proposing above, you can do all these attribute > sanity checks at the very beginning of the function. I see your point. See later. > > > + > > + switch (priv->family) { > > + case NFPROTO_IPV4: > > I'm missing a check like: > > if (priv->family != NFPROTO_UNSPEC && > ctx->family != priv->family) > return -EINVAL; > > somewhere. This switch basically does the same in a reverse logic, doesn't it? switch (ctx->family) { case NFPROTO_IPV4: if (priv->family != NFPROTO_IPV4) return -EINVAL; break; case NFPROTO_IPV6: if (priv->family != NFPROTO_IPV6) return -EINVAL; break; } > > So we don't allow crazy things like, priv->family == NFPROTO_IPV6 from > ctx->family == NFPROTO_IPV4... I may be wrong but I think it's still > possible with this code. The switch above rejects this with -EINVAL. How about this: static int nft_tproxy_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) { struct nft_tproxy *priv = nft_expr_priv(expr); unsigned int alen = 0; int err; if (!tb[NFTA_TPROXY_FAMILY] || (!tb[NFTA_TPROXY_REG_ADDR] && !tb[NFTA_TPROXY_REG_PORT])) return -EINVAL; priv->family = ntohl(nla_get_be32(tb[NFTA_TPROXY_FAMILY])); switch (ctx->family) { case NFPROTO_IPV4: if (priv->family != NFPROTO_IPV4) return -EINVAL; break; #if IS_ENABLED(CONFIG_NF_TABLES_IPV6) case NFPROTO_IPV6: if (priv->family != NFPROTO_IPV6) return -EINVAL; break; #endif case NFPROTO_INET: break; default: return -EOPNOTSUPP; } /* Address is specified but the rule family is not set accordingly */ if (priv->family == NFPROTO_UNSPEC && tb[NFTA_TPROXY_REG_ADDR]) return -EINVAL; [...] I think this addressess all of your concerns. > > > + alen = FIELD_SIZEOF(union nf_inet_addr, in); > > + err = nf_defrag_ipv4_enable(ctx->net); > > + if (err) > > + return err; > > + break; > > +#if IS_ENABLED(CONFIG_NF_TABLES_IPV6) > > + case NFPROTO_IPV6: > > + alen = FIELD_SIZEOF(union nf_inet_addr, in6); > > + err = nf_defrag_ipv6_enable(ctx->net); > > + if (err) > > + return err; > > + break; > > +#endif > > + case NFPROTO_UNSPEC: > > + /* No address is specified here */ > > + err = nf_defrag_ipv4_enable(ctx->net); > > + if (err) > > + return err; > > + err = nf_defrag_ipv6_enable(ctx->net); > > + if (err) > > + return err; > > + break; > > + default: > > + return -EOPNOTSUPP; > > + } > > + > > + if (tb[NFTA_TPROXY_REG_ADDR]) { > > + priv->sreg_addr = nft_parse_register(tb[NFTA_TPROXY_REG_ADDR]); > > + err = nft_validate_register_load(priv->sreg_addr, alen); > > + if (err < 0) > > + return err; > > + } > > + > > + if (tb[NFTA_TPROXY_REG_PORT]) { > > + priv->sreg_port = nft_parse_register(tb[NFTA_TPROXY_REG_PORT]); > > + err = nft_validate_register_load(priv->sreg_port, sizeof(u16)); > > + if (err < 0) > > + return err; > > + } > > + > > + return 0; > > +} > > + > > +static int nft_tproxy_dump(struct sk_buff *skb, > > + const struct nft_expr *expr) > > +{ > > + const struct nft_tproxy *priv = nft_expr_priv(expr); > > + > > + if (nla_put_be32(skb, NFTA_TPROXY_FAMILY, htonl(priv->family))) > > + return -1; > > + > > + if (priv->sreg_addr && > > + nft_dump_register(skb, NFTA_TPROXY_REG_ADDR, priv->sreg_addr)) > > + return -1; > > + > > + if (priv->sreg_port && > > + nft_dump_register(skb, NFTA_TPROXY_REG_PORT, priv->sreg_port)) > > + return -1; > > + > > + return 0; > > +} > > + > > +static struct nft_expr_type nft_tproxy_type; > > +static const struct nft_expr_ops nft_tproxy_ops = { > > + .type = &nft_tproxy_type, > > + .size = NFT_EXPR_SIZE(sizeof(struct nft_tproxy)), > > + .eval = nft_tproxy_eval, > > + .init = nft_tproxy_init, > > + .dump = nft_tproxy_dump, > > +}; > > + > > +static struct nft_expr_type nft_tproxy_type __read_mostly = { > > + .name = "tproxy", > > + .ops = &nft_tproxy_ops, > > + .policy = nft_tproxy_policy, > > + .maxattr = NFTA_TPROXY_MAX, > > + .owner = THIS_MODULE, > > +}; > > + > > +static int __init nft_tproxy_module_init(void) > > +{ > > + return nft_register_expr(&nft_tproxy_type); > > +} > > + > > +static void __exit nft_tproxy_module_exit(void) > > +{ > > + nft_unregister_expr(&nft_tproxy_type); > > +} > > + > > +module_init(nft_tproxy_module_init); > > +module_exit(nft_tproxy_module_exit); > > + > > +MODULE_LICENSE("GPL"); > > +MODULE_AUTHOR("Máté Eckl"); > > +MODULE_DESCRIPTION("nf_tables tproxy support module"); > > +MODULE_ALIAS_NFT_EXPR("tproxy"); > > -- > > ecklm > > > > -- > > To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in > > the body of a message to majordomo@xxxxxxxxxxxxxxx > > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html