Re: [PATCH net] netfilter: Use consistent ct id hash calculation

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> · Wed, 7 Aug 2019 22:31:46 +0200

On Wed, Aug 07, 2019 at 08:01:57PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > @Florian: by mangling this patch not to use ct->ext, including Dirk's
> > update, conntrackd works again (remember that bug we discussed during
> > NFWS).
> 
> But conntrackd is still broken.
> It can't rely on id recycling -- it will just take a lot
> longer before it starts to fill up.

Conntrackd does not rely on ID recycling. Conntrackd is in trouble
because of event loss. It seems the event re-delivery routine is
buggy: if the destroy event reached userspace sooner or later, this
entry would not get stuck in the cache forever. I can just remove
the check for the ID in userspace, so conntrackd would get rid of the
stale entry when a new entry with the same tuple shows up (lazy
garbage collection).
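
A minimal sketch of that lazy garbage collection, assuming a toy
fixed-size cache. The names flow_tuple, cache_entry and
cache_handle_new are hypothetical and are not conntrackd's real
structures or functions. The point is only that a NEW event whose
tuple matches an existing entry overwrites it, without comparing IDs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified flow key; not conntrackd's real layout. */
struct flow_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t  proto;
};

struct cache_entry {
	bool              in_use;
	struct flow_tuple tuple;
	uint32_t          id;	/* conntrack ID as reported by the kernel */
};

#define CACHE_SLOTS 64

struct cache {
	struct cache_entry slot[CACHE_SLOTS];
};

/* Compare field by field so struct padding never matters. */
static bool tuple_equal(const struct flow_tuple *a, const struct flow_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->proto == b->proto;
}

/*
 * Handle a NEW event.
 * Returns 1 if a stale entry with the same tuple was replaced,
 * 0 if the entry was inserted into a free slot, -1 if the cache is full.
 */
static int cache_handle_new(struct cache *c, const struct flow_tuple *t,
			    uint32_t id)
{
	struct cache_entry *free_slot = NULL;

	assert(c != NULL && t != NULL);

	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct cache_entry *e = &c->slot[i];

		if (!e->in_use) {
			if (free_slot == NULL)
				free_slot = e;
			continue;
		}
		if (tuple_equal(&e->tuple, t)) {
			/* Same tuple: the old entry must be stale (its
			 * destroy event was lost), so overwrite it
			 * regardless of what ID it carried. */
			e->id = id;
			return 1;
		}
	}

	if (free_slot == NULL)
		return -1;

	free_slot->in_use = true;
	free_slot->tuple = *t;
	free_slot->id = id;
	return 0;
}

/* Number of live entries, useful to check that nothing piles up. */
static int cache_count(const struct cache *c)
{
	int n = 0;

	for (size_t i = 0; i < CACHE_SLOTS; i++)
		n += c->slot[i].in_use ? 1 : 0;
	return n;
}
```

With this shape a lost destroy event costs one stale slot until the
tuple is reused, instead of an entry that lingers forever because its
ID no longer matches.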

> > @@ -470,8 +470,8 @@ u32 nf_ct_get_id(const struct nf_conn *ct)
> >  
> >         a = (unsigned long)ct;
> >         b = (unsigned long)ct->master ^ net_hash_mix(nf_ct_net(ct));
> > -       c = (unsigned long)ct->ext;
> > -       d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL], sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL]),
> > +       c = (unsigned long)0;
> > +       d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
>  
> > I think it's safe to turn this into:
> > 
> >         a = (unsigned long)ct;
> >         b = (unsigned long)ct->master;
> >         c = (unsigned long)nf_ct_net(ct);
> >         d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
> 
> No, not if we allow using the function before confirmation, the tuple
> can also change in original dir when e.g. queuing before NAT hooks.

A tuple could be built artificially, using the original-direction
source as source and the reply-direction source as destination; those
never change, IIRC.
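
A rough userspace sketch of that idea, under stated assumptions: the
structs endpoint, dir_tuple and flow are hypothetical stand-ins for
the kernel's tuple types, and FNV-1a replaces the keyed siphash purely
to keep the example self-contained. It only shows that an ID derived
from the two source endpoints is unaffected by changes to either
destination; whether those fields are truly immutable in the kernel is
not something this sketch verifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified types; not the kernel's nf_conntrack_tuple. */
struct endpoint {
	uint32_t ip;
	uint16_t port;
};

struct dir_tuple {
	struct endpoint src, dst;
};

struct flow {
	struct dir_tuple orig, reply;
};

/* Serialize one endpoint into a packed buffer so padding is never hashed. */
static size_t put_endpoint(uint8_t *buf, const struct endpoint *e)
{
	memcpy(buf, &e->ip, sizeof(e->ip));
	memcpy(buf + sizeof(e->ip), &e->port, sizeof(e->port));
	return sizeof(e->ip) + sizeof(e->port);
}

/* 32-bit FNV-1a: an unkeyed stand-in for siphash, NOT suitable for
 * anything an attacker can probe. */
static uint32_t fnv1a32(const uint8_t *p, size_t n)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Build the artificial tuple (original source, reply source) and hash it. */
static uint32_t stable_flow_id(const struct flow *f)
{
	uint8_t buf[2 * (sizeof(uint32_t) + sizeof(uint16_t))];
	size_t n = 0;

	assert(f != NULL);

	n += put_endpoint(buf + n, &f->orig.src);
	n += put_endpoint(buf + n, &f->reply.src);
	return fnv1a32(buf, n);
}
```

Rewriting orig.dst or reply.dst leaves stable_flow_id() unchanged,
since neither field is part of the hashed buffer; changing either
source endpoint yields a different input.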

This hash-based ID calculation is a simple approach, but it looks weak
and easy to break.