Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > > percpu template would allow us to combine both, I mean, to use the > > > template as a scratchpad area. The template is only used from the same > > > hook point to pass information between hook callbacks. > > > > I found no way to do this. > > > > Consider this: > > > > tcp dport 22 ct timeout set "baz" accept > > meta iifname ct zone set 23 ct timeout set "foo" accept > > ct zone set 42 ct timeout set "bar" accept > > > > (ruleset is not very meaningful, only to illustrate problem) > > > > If a pcpu template is used, we must reset that pcpu scatchpad area > > back to a pristine state whenever a new skb comes in. > > > > Otherwise, if first rule does match, pcpu template might still contain > > 'zone 23' or 'zone 42'. > > Hm, indeed. We can probably set an internal dirty flag in the > tmpl->status bits from nf_conntrack_in() so nft_ct knows the percpu > tmpl needs a cleanup (and such cleanup would be done from > nft_ct/xt_CT). We could preallocate helper and timeout, so they are > just NULL in case now helper/timeout is attached. skb can be dropped, so its possible template never reaches nf_conntrack_in() :-/ > > And we don't know when the 'next' skb comes along. > > nfqueue complicates this further (see the ugly refcnt check in the > > nft_ct.c parts that deal with 'zone set' support). > > From nfqueue, the template is tricky indeed. One thing that we could do is to enforce a full clone of the template in nf_queue. This would place the ugliness only in nfqueue and would remove 'async problem' from the picture. Doesn't address the reuse issue though. > > if skb->nfct is NULL, allocate a new template from the packet path > > and initialise it. > > Hm, could we only follow this approach your describe for the nfqueue > reinject case? I mean, clone the template as you describe only from > the nfqueue escape path. Then, use the percpu approach I described > above for the usual/no nfqueue path. Yes, indeed, we can do this. But it doesn't resolve re-use issue for pcpu area. > Otherwise, we could store the zone u16 value from the nfqueue escape > path to simplify things? Would work for zone case, but not for everything else. The zone is the only attribute that MUST be handled before nf_conntrack_in, so if we agree that all other properties can only be set up post conntrack, thats problem goes away. Its sh*t from user point of view though. I wonder if we can resolve this by enforcing a 'all in one rule' approach. Example: ct zone set 42 ct timeout set "foo" We can discover (from config plane) which nft_ct instance is called first (zone in this case). So, this instance will NFT_BREAK if skb->nfct != NULL. Otherwise, it resets pcpu template and inits the zone. This assumes nfqueue is no longer an issue. Unfortunately this means quite a lot of intrusive changes that I think are no-go for nf.git. > For ct helper, we would need to store the > helper name from the escape path and do a lookup to reset it from the > nf_reinject(). Hm, for cttimeout, it's get more tricky... as we need a > way to reattach the timeout for the template. Yup. > Isn't the template idea itself is very much broken even for xt_CT with > nfqueue? No, because in xt_CT case the templates are readonly objects (except refcount). > > > > + case IPPROTO_TCP: > > > > + ct->timeout = values[1]; > > > > + break; > > > > + default: > > > > + ct->timeout = values[0]; > > > > > > I think we can avoid the assymetry by making UDP and generic trackers > > > point to 1 for the default timeout, so we don't need this. It will be > > > just 4 bytes more per the global timeout array and we don't need this > > > function, it would need a preparation patch for this. > > > > We could also make TCP/SCTP/DCCP use 0, as that is unused. > > I can make this change. > > Yes, that's also fine, note that for TCP we expose through UAPI the > array order I think, so we need to deal with using 0 from the sysctl > and the ctnetlink path. Yes, a mapping is needed (x - 1), so not a big deal.