On Wed, Aug 07, 2019 at 08:01:57PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > @Florian: by mangling this patch not to use ct->ext, including Dirk's
> > update, conntrackd works again (remember that bug we discussed during
> > NFWS).
>
> But conntrackd is still borken.
> It can't rely on id recycling -- it will just take a lot
> longer before it starts to fill up.

Conntrackd does not rely on ID recycling.

Conntrackd is in trouble because of event loss. It seems the event
re-delivery routine is buggy: if the destroy event got to userspace
sooner or later, this entry would not get stuck in the cache forever.

I can just remove the check for the ID in userspace, so conntrackd
would get rid of the stale entry when a new entry with the same tuple
shows up (lazy garbage collection).

> > @@ -470,8 +470,8 @@ u32 nf_ct_get_id(const struct nf_conn *ct)
> >
> >  	a = (unsigned long)ct;
> >  	b = (unsigned long)ct->master ^ net_hash_mix(nf_ct_net(ct));
> > -	c = (unsigned long)ct->ext;
> > -	d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL], sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL]),
> > +	c = (unsigned long)0;
> > +	d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
>
> > I think it's safe to turn this into:
> >
> >  	a = (unsigned long)ct;
> >  	b = (unsigned long)ct->master;
> >  	c = (unsigned long)nf_ct_net(ct));
> >  	d = (unsigned long)siphash(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
>
> No, not if we allow using the function before confirmation, the tuple
> can also change in original dir when e.g. queuing before NAT hooks.

The tuple could be artificially built from the original source as
source and the reply source as destination; those never change, IIRC.

This hash-based ID calculation is a simple approach, but it looks weak
/ easy to break.
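
To make the "artificial tuple" idea concrete, here is a rough userspace
sketch, not kernel code. Everything in it is an assumption made for
illustration: struct endpoint, struct conn and stable_id() are
hypothetical stand-ins for the real conntrack structures, and FNV-1a
is only a placeholder for the keyed siphash that nf_ct_get_id() uses.
It shows only that an ID derived from the original source and the
reply source is unaffected by changes to the other two endpoints.
Whether those two fields really stay fixed in the kernel is the IIRC
above, which this does not prove.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel tuple structures. */
struct endpoint {
	uint32_t addr;	/* IPv4 address, host byte order */
	uint16_t port;	/* L4 port, host byte order */
};

struct conn {
	struct endpoint orig_src, orig_dst;	/* original direction */
	struct endpoint reply_src, reply_dst;	/* reply direction */
	uint8_t proto;
};

/* One FNV-1a round over a 32-bit value, fed least significant byte first.
 * Placeholder only: the kernel uses siphash with a secret key. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (uint8_t)(v >> (8 * i));
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/* ID built from the "artificial" tuple: original source as source,
 * reply source as destination. Fields are hashed one by one so that
 * struct padding never enters the hash. */
uint32_t stable_id(const struct conn *c)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	assert(c != NULL);
	h = fnv1a_u32(h, c->orig_src.addr);
	h = fnv1a_u32(h, c->orig_src.port);
	h = fnv1a_u32(h, c->reply_src.addr);
	h = fnv1a_u32(h, c->reply_src.port);
	h = fnv1a_u32(h, c->proto);
	return h;
}
```

Rewriting orig_dst or reply_dst leaves stable_id() unchanged, while
touching orig_src or reply_src changes it. Without a secret key the
result is also trivially predictable from userspace, so this is no
stronger than the approach I called weak above.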