On Wed, Oct 14, 2020 at 02:06:28AM +0200, Pablo Neira Ayuso wrote:
> On Fri, Oct 09, 2020 at 10:05:48PM +0200, Florian Westphal wrote:
> > Jozsef Kadlecsik <kadlec@xxxxxxxxxxxxx> wrote:
> > > > The "delay unregister" remark was wrt. the "all rules were deleted"
> > > > case, i.e. add a "grace period" rather than acting right away when
> > > > conntrack use count did hit 0.
> > >
> > > Now I understand it, thanks really. The hooks are removed, so conntrack
> > > cannot "see" the packets and the entries become stale.
> >
> > Yes.
> >
> > > What is the rationale behind "remove the conntrack hooks when there are no
> > > rule left referring to conntrack"? Performance optimization? But then the
> > > content of the whole conntrack table could be deleted too... ;-)
> >
> > Yes, this isn't the case at the moment -- only hooks are removed,
> > entries will eventually time out.
> >
> > > > Conntrack entries are not removed, only the base hooks get unregistered.
> > > > This is a problem for tcp window tracking.
> > > >
> > > > When re-register occurs, kernel is supposed to switch the existing
> > > > entries to "loose" mode so window tracking won't flag packets as
> > > > invalid, but apparently this isn't enough to handle keepalive case.
> > >
> > > "loose" (nf_ct_tcp_loose) mode doesn't disable window tracking, it
> > > enables/disables picking up already established connections.
> > >
> > > nf_ct_tcp_be_liberal would disable TCP window checking (but not tracking)
> > > for non RST packets.
> >
> > You are right, mixup on my part.
> >
> > > But both seems to be modified only via the proc entries.
> >
> > Yes, we iterate table on re-register and modify the existing entries.
>
> For iptables-nft, it might be possible to avoid this deregister +
> register ct hooks in the same transaction: Maybe add something like
> nf_ct_netns_get_all() to bump refcounters by one _iff_ they are > 0
> before starting the transaction processing, then call
> nf_ct_netns_put_all() which decrements refcounters and unregister
> hooks if they reach 0.

Hm, scratch that, put_all() would create an imbalance with this
conditional increment.

> The only problem with this approach is that this pulls in the
> conntrack module, to solve that, struct nf_ct_hook in
> net/netfilter/core.c could be used to store the reference to
> ->netns_get_all and ->net_put_all.
>
> Legacy would still be flawed though.
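
To spell out the imbalance mentioned above, here is a minimal userspace
sketch in C. It is not kernel code and only reflects one reading of the
problem: the nf_ct_netns_get_all()/nf_ct_netns_put_all() helpers are
merely proposed in this thread, and every name below (struct ct_model,
model_get(), model_put(), model_get_all(), model_put_all(),
MODEL_NFAMILIES) is hypothetical. The model assumes put_all() decrements
every counter because it has no record of which ones get_all() bumped.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NFAMILIES 2	/* hypothetical: one use counter per family */

/* Toy stand-in for the per-netns conntrack use counters and hook state. */
struct ct_model {
	int  users[MODEL_NFAMILIES];
	bool hooked[MODEL_NFAMILIES];
};

/* A rule starts using conntrack: first user registers the hooks. */
static void model_get(struct ct_model *m, int f)
{
	assert(f >= 0 && f < MODEL_NFAMILIES);
	if (m->users[f]++ == 0)
		m->hooked[f] = true;
}

/* A rule stops using conntrack: last user unregisters the hooks. */
static void model_put(struct ct_model *m, int f)
{
	assert(f >= 0 && f < MODEL_NFAMILIES);
	if (--m->users[f] == 0)
		m->hooked[f] = false;
}

/* Proposed pre-transaction step: bump only counters that are > 0. */
static void model_get_all(struct ct_model *m)
{
	for (int f = 0; f < MODEL_NFAMILIES; f++)
		if (m->users[f] > 0)
			m->users[f]++;
}

/* Assumed post-transaction step: drop one reference on every counter. */
static void model_put_all(struct ct_model *m)
{
	for (int f = 0; f < MODEL_NFAMILIES; f++)
		model_put(m, f);
}

/*
 * Transaction that adds the first conntrack rule for family 0.
 * Returns the use count of family 0 afterwards.
 */
static int model_transaction_first_rule(struct ct_model *m)
{
	model_get_all(m);	/* family 0 is at 0, so it is not bumped */
	model_get(m, 0);	/* rule added: users = 1, hooks registered */
	model_put_all(m);	/* users back to 0: hooks dropped again */
	return m->users[0];
}
```

Starting from an all-zero struct ct_model, model_transaction_first_rule()
leaves family 0 with a use count of 0 and no hooks although a rule now
refers to conntrack, and the untouched family 1 ends up at -1. When a
counter was already > 0 before the transaction the get/put pair does
balance, so in this model the imbalance only shows up for counters that
were skipped by the conditional increment.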