[...] > > Allow me to finish the full sentence: my only purpose is not to let > > the TCP layer send strange RST to the _established_ socket due to > > receiving strange out-of-window skbs. > > I don't understand why do you want to modify conntrack at all: conntrack > itself does not send RST packets. And the TCP layer don't send RST packets > to out of window ones either. Thanks for your reply. To normal TCP flow, you're right because the TCP layer doesn't send RST to out-of-window skbs. But the DNAT policy on the server should have converted the port of incoming skb from A_port to B_port as my description in this patch said. It actually doesn't happen. The conntrack clears the skb->_nfct value after returning -NF_ACCEPT in nf_conntrack_tcp_packet() and then DNAT would not convert the A_port to B_port. So the TCP layer is not able to look up the correct socket (client_ip, client_port, server_ip, B_port) because A_port doesn't match B_port, then an RST would be sent to the client. > > The only possibility I see for such packets is an iptables/nftables rule > which rejects packets classified as INVALID by conntrack. > > As Florian suggested, why don't you change that rule? As I said, just for the workaround method, only turning on that sysctl knob can solve the issue. Thanks, Jason > > The conntrack states are not fine-grained to express different TCP states > which covered with INVALID. It was never a good idea to reject INVALID > packets or let them through (leaking internal addresses). > > Best regards, > Jozsef > > > > > Besides, resorting to turning on nf_conntrack_tcp_be_liberal sysctl > > > > knob seems odd to me though it can workaround :S > > > > > > I don't see a better alternative, other than -p tcp -m conntrack > > > --ctstate INVALID -j DROP rule, if you wish for tcp stack to not see > > > such packets. > > > > > > > I would like to prevent sending such an RST as default behaviour. > > > > > > I don't see a way to make this work out of the box, without possible > > > unwanted side effects. > > > > > > MAYBE we could drop IFF we check that the conntrack entry candidate > > > that fails sequence validation has NAT translation applied to it, and > > > thus the '-NF_ACCEPT' packet won't be translated. > > > > > > Not even compile tested: > > > > > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c > > > --- a/net/netfilter/nf_conntrack_proto_tcp.c > > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > > > @@ -1256,10 +1256,14 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct, > > > case NFCT_TCP_IGNORE: > > > spin_unlock_bh(&ct->lock); > > > return NF_ACCEPT; > > > - case NFCT_TCP_INVALID: > > > + case NFCT_TCP_INVALID: { > > > + verdict = -NF_ACCEPT; > > > + if (ct->status & IPS_NAT_MASK) > > > + res = NF_DROP; /* skb would miss nat transformation */ > > > > Above line, I guess, should be 'verdict = NF_DROP'? Then this skb > > would be dropped in nf_hook_slow() eventually and would not be passed > > to the TCP layer. > > > > > nf_tcp_handle_invalid(ct, dir, index, skb, state); > > > spin_unlock_bh(&ct->lock); > > > - return -NF_ACCEPT; > > > + return verdict; > > > + } > > > case NFCT_TCP_ACCEPT: > > > break; > > > } > > > > Great! I think your draft patch makes sense really, which takes NAT > > into consideration. > > > > > > > > But I don't really see the advantage compared to doing drop decision in > > > iptables/nftables ruleset. > > > > From our views, especially to kernel developers, you're right: we > > could easily turn on that knob or add a drop policy to prevent it > > happening. Actually I did this in production to prevent such a case. > > It surely works. > > > > But from the views of normal users and those who do not understand how > > it works in the kernel, it looks strange: people may ask why we get > > some unknown RSTs in flight? > > > > > I also have a hunch that someone will eventually complain about this > > > change in behavior. > > > > Well, I still think the patch you suggested is proper and don't know > > why people could complain about it. > > > > Thanks for your patience :) > > > > Thanks, > > Jason > > > > -- > E-mail : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxx > PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt > Address : Wigner Research Centre for Physics > H-1525 Budapest 114, POB. 49, Hungary