Hello Florian, On Thu, Mar 7, 2024 at 5:33 PM Florian Westphal <fw@xxxxxxxxx> wrote: > > Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > client_ip:client_port <--> server_ip:b_port > > > > Then, some strange skbs from client or gateway, say, out-of-window > > skbs are sent to the server_ip:a_port (not b_port) due to DNAT > > clearing skb->_nfct value in nf_conntrack_in() function. Why? > > Because the tcp_in_window() considers the incoming skb as an > > invalid skb by returning NFCT_TCP_INVALID. > > So far everything is as intended. > > > I think, even we have set DNAT policy, it would be better if the > > whole process/behaviour adheres to the original TCP behaviour. > > > > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx> > > --- > > net/netfilter/nf_conntrack_proto_tcp.c | 6 ++---- > > 1 file changed, 2 insertions(+), 4 deletions(-) > > > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c > > index ae493599a3ef..3f3e620f3969 100644 > > --- a/net/netfilter/nf_conntrack_proto_tcp.c > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > > @@ -1253,13 +1253,11 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, > > res = tcp_in_window(ct, dir, index, > > skb, dataoff, th, state); > > switch (res) { > > - case NFCT_TCP_IGNORE: > > - spin_unlock_bh(&ct->lock); > > - return NF_ACCEPT; > > case NFCT_TCP_INVALID: > > nf_tcp_handle_invalid(ct, dir, index, skb, state); > > + case NFCT_TCP_IGNORE: > > spin_unlock_bh(&ct->lock); > > - return -NF_ACCEPT; > > + return NF_ACCEPT; > > This looks wrong. -NF_ACCEPT means 'pass packet, but its not part > of the connection' (packet will match --ctstate INVALID check). > > This change disables most of the tcp_in_window() test, this will > pretend everything is fine even though tcp_in_window says otherwise. Thanks for the information. It does make sense. What I've done is quite similar to nf_conntrack_tcp_be_liberal sysctl knob which you also pointed out. It also pretends to ignore those out-of-window skbs. > > You could: > - drop invalid tcp packets in input hook How about changing the return value only as below? Only two cases will be handled: diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index ae493599a3ef..c88ce4cd041e 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -1259,7 +1259,7 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, case NFCT_TCP_INVALID: nf_tcp_handle_invalid(ct, dir, index, skb, state); spin_unlock_bh(&ct->lock); - return -NF_ACCEPT; + return -NF_DROP; case NFCT_TCP_ACCEPT: break; } Then it will be dropped naturally in nf_hook_slow(). > - set nf_conntrack_tcp_be_liberal=1 Sure, it can workaround this case, but I would like to refuse the out-of-window in netfilter or TCP layer as default instead of turning on this sysctl knob. If I understand wrong, please correct me. Thanks, Jason > > both will avoid this 'rst' issue.