Re: [PATCH nf] netfilter: nf_tables: rework ct timeout set support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Florian,

On Wed, Aug 22, 2018 at 05:18:36PM +0200, Florian Westphal wrote:
> Using a private template is problematic:
> 
> 1. We can't handle conntrack is already assigned case
> 2. We can't assign both a zone and a timeout policy
>    (zone assigns a conntrack template, so we hit problem 1)

Right, we have a clash between zone and timeout in this case. In the
CT target, we could workaround this limitation by setting the timeout
and zone in one go from the same rule, but this is not possible in
this approach.

> 3. Using a template needs to take care of ct refcount, else we'll
>    eventually free the private template due to ->use underflow.

Right, rule may be gone while the packet is walking on the skbuff
bits, refcount is missing.

> This patch reworks template policy to instead work with existing conntrack.
> 
> As long as such conntrack has not yet been placed into the hash table
> (unconfirmed) we can still add the timeout extension.
> 
> The only caveat is that we now need to update/correct ct->timeout to
> reflect the initial/new state, otherwise the conntrack entry retains the
> default 'new' timeout.
> 
> Side effect of this change is that setting the policy must
> now occur from chains that are evaluated *after* the conntrack lookup
> has taken place.
> 
> No released kernel contains the timeout policy feature yet, so this change
> should be ok.
> 
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> ---
>  nft_ct_timeout_update() is ugly, but i found no better
>  solution.

See below.

>  The alternative is to allow use of templates,
>  but that requires nasty kmemdup() games to clone the
>  template, else we'd modify some percpu/readonly template.

percpu template would allow us to combine both, I mean, to use the
template as a scratchpad area. The template is only used from the same
hook point to pass information between hook callbacks.

I'm thinking about iptables-compat too, if we're planning for more
native mappings, we will need to leave the former behaviour in place,
otherwise we cannot provide a translation via xlate.

Probably we need to support both approaches? I mean, the legacy one
that allows users to set the timeout from the raw table and the new
one which doesn't need template at all. Same thing with ct helpers in
nftables, which don't templates either.

>  include/net/netfilter/nf_conntrack_timeout.h |  2 +-
>  net/netfilter/nft_ct.c                       | 92 +++++++++++++++++++---------
>  2 files changed, 63 insertions(+), 31 deletions(-)
> 
> diff --git a/include/net/netfilter/nf_conntrack_timeout.h b/include/net/netfilter/nf_conntrack_timeout.h
> index d5f62cc6c2ae..3394d75e1c80 100644
> --- a/include/net/netfilter/nf_conntrack_timeout.h
> +++ b/include/net/netfilter/nf_conntrack_timeout.h
> @@ -30,7 +30,7 @@ struct nf_conn_timeout {
>  };
>  
>  static inline unsigned int *
> -nf_ct_timeout_data(struct nf_conn_timeout *t)
> +nf_ct_timeout_data(const struct nf_conn_timeout *t)
>  {
>  	struct nf_ct_timeout *timeout;
>  
> diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
> index 26a8baebd072..234562633e97 100644
> --- a/net/netfilter/nft_ct.c
> +++ b/net/netfilter/nft_ct.c
> @@ -25,6 +25,10 @@
>  #include <net/netfilter/nf_conntrack_timeout.h>
>  #include <net/netfilter/nf_conntrack_l4proto.h>
>  
> +#include <linux/netfilter/nf_conntrack_dccp.h>
> +#include <uapi/linux/netfilter/nf_conntrack_sctp.h>
> +#include <uapi/linux/netfilter/nf_conntrack_tcp.h>
> +
>  struct nft_ct {
>  	enum nft_ct_keys	key:8;
>  	enum ip_conntrack_dir	dir:8;
> @@ -799,36 +803,81 @@ nft_ct_timeout_parse_policy(void *timeouts,
>  }
>  
>  struct nft_ct_timeout_obj {
> -	struct nf_conn		*tmpl;
> +	struct nf_ct_timeout    *timeout;
>  	u8			l4proto;
>  };
>  
> +static void nft_ct_timeout_update(struct nf_conn *ct,
> +				  const struct nf_conn_timeout *timeout)
> +{
> +	const unsigned int *values = nf_ct_timeout_data(timeout);
> +
> +	/* adjust the timeout as per 'new' state. ct is unconfirmed,
> +	 * so the current timestamp must not be added.
> +	 */
> +	if (!values)
> +		return;
> +
> +	switch (nf_ct_protonum(ct)) {
> +	case IPPROTO_DCCP: /* all fall through */
> +	case IPPROTO_SCTP:
> +	case IPPROTO_TCP:
> +		ct->timeout = values[1];
> +		break;
> +	default:
> +		ct->timeout = values[0];

I think we can avoid the assymetry by making UDP and generic trackers
point to 1 for the default timeout, so we don't need this. It will be
just 4 bytes more per the global timeout array and we don't need this
function, it would need a preparation patch for this.

We can probably use nf_ct_refresh() instead here?

> +		break;
> +	}
> +
> +	BUILD_BUG_ON(TCP_CONNTRACK_SYN_SENT != 1);
> +	BUILD_BUG_ON(SCTP_CONNTRACK_CLOSED != 1);
> +	BUILD_BUG_ON(CT_DCCP_REQUEST != 1);
> +}
> +
>  static void nft_ct_timeout_obj_eval(struct nft_object *obj,
>  				    struct nft_regs *regs,
>  				    const struct nft_pktinfo *pkt)
>  {
>  	const struct nft_ct_timeout_obj *priv = nft_obj_data(obj);
>  	struct nf_conn *ct = (struct nf_conn *)skb_nfct(pkt->skb);
> -	struct sk_buff *skb = pkt->skb;
> +	struct nf_conn_timeout *timeout;
>  
> -	if (ct ||
> -	    priv->l4proto != pkt->tprot)
> +	if (priv->l4proto != pkt->tprot)
>  		return;
>  
> -	nf_ct_set(skb, priv->tmpl, IP_CT_NEW);
> +	if (!ct || nf_ct_is_template(ct))
> +		return;
> +
> +	timeout = nf_ct_timeout_find(ct);
> +	if (timeout) {

I think we should turn this into noop for unconfirmed.

	if (nf_ct_is_confirmed(ct)) {
                regs->verdict.code = NFT_BREAK;
                return;
        }

This would simplify things I would expect.



[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux