Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > On Thu, Mar 7, 2024 at 8:00 PM Florian Westphal <fw@xxxxxxxxx> wrote: > > > > Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > > > This change disables most of the tcp_in_window() test, this will > > > > pretend everything is fine even though tcp_in_window says otherwise. > > > > > > Thanks for the information. It does make sense. > > > > > > What I've done is quite similar to nf_conntrack_tcp_be_liberal sysctl > > > knob which you also pointed out. It also pretends to ignore those > > > out-of-window skbs. > > > > > > > > > > > You could: > > > > - drop invalid tcp packets in input hook > > > > > > How about changing the return value only as below? Only two cases will > > > be handled: > > > > > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c > > > b/net/netfilter/nf_conntrack_proto_tcp.c > > > index ae493599a3ef..c88ce4cd041e 100644 > > > --- a/net/netfilter/nf_conntrack_proto_tcp.c > > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > > > @@ -1259,7 +1259,7 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, > > > case NFCT_TCP_INVALID: > > > nf_tcp_handle_invalid(ct, dir, index, skb, state); > > > spin_unlock_bh(&ct->lock); > > > - return -NF_ACCEPT; > > > + return -NF_DROP; > > > > Lets not do this. conntrack should never drop packets and defer to ruleset > > whereever possible. > > Hmm, sorry, it is against my understanding. > > If we cannot return -NF_DROP, why have we already added some 'return > NF_DROP' in the nf_conntrack_handle_packet() function? And why does > this test statement exist? Sure we can drop. But we should only do it if there is no better alternative. > nf_conntrack_in() > -> nf_conntrack_handle_packet() > -> if (ret <= 0) { > if (ret == -NF_DROP) NF_CT_STAT_INC_ATOMIC(state->net, drop); AFAICS this only happens when we receive syn for an existing conntrack that is being removed already so we'd expect next syn to create a new connection. Feel free to send patches that replace drop with -accept where possible/where it makes sense, but I don't think the TCP_CONNTRACK_SYN_SENT one can reasonably be avoided. > My only purpose is not to let the TCP layer sending strange RST to the > right flow. AFAIU tcp layer is correct, no? Out of the blue packet to some listener socket? > Besides, resorting to turning on nf_conntrack_tcp_be_liberal sysctl > knob seems odd to me though it can workaround :S I don't see a better alternative, other than -p tcp -m conntrack --ctstate INVALID -j DROP rule, if you wish for tcp stack to not see such packets. > I would like to prevent sending such an RST as default behaviour. I don't see a way to make this work out of the box, without possible unwanted side effects. MAYBE we could drop IFF we check that the conntrack entry candidate that fails sequence validation has NAT translation applied to it, and thus the '-NF_ACCEPT' packet won't be translated. Not even compile tested: diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -1256,10 +1256,14 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, case NFCT_TCP_IGNORE: spin_unlock_bh(&ct->lock); return NF_ACCEPT; - case NFCT_TCP_INVALID: + case NFCT_TCP_INVALID: { + verdict = -NF_ACCEPT; + if (ct->status & IPS_NAT_MASK) + res = NF_DROP; /* skb would miss nat transformation */ nf_tcp_handle_invalid(ct, dir, index, skb, state); spin_unlock_bh(&ct->lock); - return -NF_ACCEPT; + return verdict; + } case NFCT_TCP_ACCEPT: break; } But I don't really see the advantage compared to doing drop decision in iptables/nftables ruleset. I also have a hunch that someone will eventually complain about this change in behavior.