yyxRoy <yyxroy22@xxxxxxxxx> wrote:
> With previous commit https://github.com/torvalds/linux/commit/be0502a
> ("netfilter: conntrack: tcp: only close if RST matches exact sequence")
> to fight against TCP in-window reset attacks, current version of netfilter
> will keep the connection state in ESTABLISHED, but lower the timeout to
> that of CLOSE (10 seconds by default) for in-window TCP RSTs, and wait for
> the peer to send a challenge ack to restore the connection timeout
> (5 mins in tests).
>
> However, malicious attackers can prevent incurring challenge ACKs by
> manipulating the TTL value of RSTs. The attacker can probe the TTL value
> between the NAT device and itself and send in-window RST packets with
> a TTL value to be decreased to 0 after arriving at the NAT device.
> This causes the packet to be dropped rather than forwarded to the
> internal client, thus preventing a challenge ACK from being triggered.
> As the window of the sequence number is quite large (bigger than 60,000
> in tests) and the sequence number is 16-bit, the attacker only needs to
> send nearly 60,000 RST packets with different sequence numbers
> (i.e., 1, 60001, 120001, and so on) and one of them will definitely
> fall within in the window.
>
> Therefore we can't simply lower the connection timeout to 10 seconds
> (rather short) upon receiving in-window RSTs. With this patch, netfilter
> will lower the connection timeout to that of CLOSE only when it receives
> RSTs with exact sequence numbers (i.e., old_state != new_state).

This effectively ignores most RST packets, which will clog up the
conntrack table (established timeout is 5 days).

I don't think there is anything sensible that we can do here.

Also, one can send a train consisting of a data packet followed by an
RST, and we will hit the immediate-close conditional:

	/* Check if rst is part of train, such as
	 *   foo:80 > bar:4379: P, 235946583:235946602(19) ack 42
	 *   foo:80 > bar:4379: R, 235946602:235946602(0)  ack 42
	 */
	if (ct->proto.tcp.last_index == TCP_ACK_SET &&
	    ct->proto.tcp.last_dir == dir &&
	    seq == ct->proto.tcp.last_end)
		break;

So even if we made this change, it would not prevent remotely induced
resets.

Conntrack cannot validate RSTs precisely due to lack of information;
only the endpoints can do this.
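
For illustration only, here is a minimal userspace sketch of that
conditional. The names (train_state, note_segment, rst_ends_train,
example) are made up for this mail and are not the kernel's; only the
three-part comparison mirrors the snippet above. It assumes the tracker
records the end sequence of the last data/ACK segment it accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's "last packet seen" bookkeeping. */
enum last_kind { LAST_NONE, LAST_ACK_SET };

struct train_state {
	enum last_kind last_index; /* kind of the last packet recorded */
	int last_dir;              /* direction of that packet (0 or 1) */
	uint32_t last_end;         /* its seq + payload length, mod 2^32 */
};

/* Record a data/ACK segment the tracker has just accepted. */
static void note_segment(struct train_state *st, int dir,
			 uint32_t seq, uint32_t len)
{
	st->last_index = LAST_ACK_SET;
	st->last_dir = dir;
	st->last_end = seq + len; /* unsigned arithmetic wraps as TCP does */
}

/* True if an RST with this seq looks like the tail of a packet train,
 * i.e. it follows the last segment in the same direction with no gap.
 */
static bool rst_ends_train(const struct train_state *st, int dir,
			   uint32_t seq)
{
	return st->last_index == LAST_ACK_SET &&
	       st->last_dir == dir &&
	       seq == st->last_end;
}

/* Replays the two packets from the comment in the kernel snippet. */
static void example(void)
{
	struct train_state st = { LAST_NONE, 0, 0 };

	/* foo:80 > bar:4379: P, 235946583:235946602(19) */
	note_segment(&st, 0, 235946583u, 19);

	/* foo:80 > bar:4379: R, 235946602:235946602(0) */
	assert(rst_ends_train(&st, 0, 235946602u));

	/* An RST one byte off, or from the other direction, does not match. */
	assert(!rst_ends_train(&st, 0, 235946603u));
	assert(!rst_ends_train(&st, 1, 235946602u));
}
```

The point of the sketch is that whoever supplies the data segment also
fixes last_end, so the RST that follows it matches by construction.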