Re: Connection timeouts due to INVALID state rule

Anton Danilov <littlesmilingcloud@xxxxxxxxx> · Mon, 8 Jul 2019 13:51:50 +0300

Hello.

To avoid this issue you can tune the conntrack behaviour with sysctl:
sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1
sysctl -w net.netfilter.nf_conntrack_tcp_loose=1
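
Values set with "sysctl -w" are lost on reboot, and the net.netfilter.* keys
only exist once the nf_conntrack module is loaded. The following is a minimal
sketch for checking the current values first. The function name ct_show and
the SYSCTL_ROOT override are hypothetical helpers for illustration, and the
drop-in file name in the trailing comment is an assumption.

```shell
# Sketch only: SYSCTL_ROOT is a hypothetical override so the function can be
# tried against a scratch directory; on a real host it defaults to /proc/sys.
ct_show() {
    root="${SYSCTL_ROOT:-/proc/sys}"
    for name in nf_conntrack_tcp_be_liberal nf_conntrack_tcp_loose; do
        file="$root/net/netfilter/$name"
        if [ -r "$file" ]; then
            printf 'net.netfilter.%s = %s\n' "$name" "$(cat "$file")"
        else
            printf 'net.netfilter.%s = (not available; is nf_conntrack loaded?)\n' "$name"
        fi
    done
}

ct_show

# To keep the settings across reboots (run as root; file name is an assumption):
#   printf '%s\n' 'net.netfilter.nf_conntrack_tcp_be_liberal = 1' \
#                 'net.netfilter.nf_conntrack_tcp_loose = 1' \
#       > /etc/sysctl.d/90-conntrack-liberal.conf
#   sysctl --system
```

If the drop-in is applied at boot before nf_conntrack is loaded, the keys may
not exist yet, so it is worth re-running ct_show after a reboot.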

---
From https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt:
nf_conntrack_tcp_be_liberal - BOOLEAN
    0 - disabled (default)
    not 0 - enabled

    Be conservative in what you do, be liberal in what you accept from others.
    If it's non-zero, we mark only out of window RST segments as INVALID.

nf_conntrack_tcp_loose - BOOLEAN
    0 - disabled
    not 0 - enabled (default)

    If it is set to zero, we disable picking up already established
    connections.
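
To see which packets conntrack actually classifies as INVALID before changing
these settings, one option is to log them ahead of the DROP rule. This is a
sketch under assumptions, not something tested on the affected systems: the
function name log_invalid_on_lo and the APPLY switch are hypothetical, and
the rule position (1) and log prefix must be adapted to the real ruleset.

```shell
# Sketch only. By default the function just prints the command (dry run);
# set APPLY=1 and run as root to execute it.
log_invalid_on_lo() {
    # Log INVALID packets arriving on lo. The rule is inserted at position 1
    # (an assumption) so that it is evaluated before the existing DROP rule.
    set -- iptables -I INPUT 1 -i lo -m conntrack --ctstate INVALID \
        -j LOG --log-prefix ct-invalid:
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

log_invalid_on_lo
```

With the rule in place, matching packets should appear in the kernel log
(for example via dmesg). Remove the rule again once the offending packets
have been identified, since LOG rules on a busy host can be noisy.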

On Mon, 8 Jul 2019 at 01:45, Will Storey <will@xxxxxxxxxxxxx> wrote:
>
> Hello,
>
> I've been experiencing sporadic timeouts when connecting to daemons on
> 127.0.0.1. I narrowed the cause down to an iptables INPUT rule that blocks
> INVALID state packets:
>
>  603K   24M DROP  all  --  *  *  0.0.0.0/0  0.0.0.0/0   state INVALID
>
> I can work around this by allowing everything on lo before this rule, but
> I'm wondering if this is expected or not.
>
> Here's more about the situation:
>
> All involved systems are running Ubuntu Bionic with kernel
> 4.15.0-52-generic.
>
> On systems with the problem, there are half open TCP connections:
>
> tcp        0      0 127.0.0.1:2348          127.0.0.1:47268         ESTABLISHED
>
> When a client connects with source port 47268, it gets stuck in SYN_SENT
> and eventually times out:
>
> 22:09:17.601482 IP (tos 0x0, ttl 64, id 53505, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.47268 > 127.0.0.1.2348: Flags [S], cksum 0xfe30 (incorrect -> 0x02e6), seq 3436316390, win 43690, options [mss 65495,sackOK,TS val 712761924 ecr 0,nop,wscale 7], length 0
> 22:09:17.601487 IP (tos 0x0, ttl 64, id 42105, offset 0, flags [DF], proto TCP (6), length 52)
>     127.0.0.1.2348 > 127.0.0.1.47268: Flags [.], cksum 0xfe28 (incorrect -> 0x08f5), seq 1489307482, ack 3500129728, win 2309, options [nop,nop,TS val 712761924 ecr 696680490], length 0
> 22:09:18.629342 IP (tos 0x0, ttl 64, id 53506, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.47268 > 127.0.0.1.2348: Flags [S], cksum 0xfe30 (incorrect -> 0xfee1), seq 3436316390, win 43690, options [mss 65495,sackOK,TS val 712762952 ecr 0,nop,wscale 7], length 0
> 22:09:18.629469 IP (tos 0x0, ttl 64, id 42106, offset 0, flags [DF], proto TCP (6), length 52)
>     127.0.0.1.2348 > 127.0.0.1.47268: Flags [.], cksum 0xfe28 (incorrect -> 0x04f1), seq 0, ack 1, win 2309, options [nop,nop,TS val 712762952 ecr 696680490], length 0
>
> It repeats like this (SYN then ACK) until timeout.
>
> My understanding is that I should see a RST from the client and the
> handshake beginning from scratch. Indeed, if I create a half open TCP
> connection to try to replicate the issue, that's what I see:
>
> 14:19:47.429668 IP (tos 0x0, ttl 64, id 35002, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.59118 > 127.0.0.1.2348: Flags [S], cksum 0xfe30 (incorrect -> 0xf9f1), seq 1911409434, win 43690, options [mss 65495,sackOK,TS val 2900480312 ecr 0,nop,wscale 7], length 0
> 14:19:47.429698 IP (tos 0x0, ttl 64, id 44792, offset 0, flags [DF], proto TCP (6), length 52)
>     127.0.0.1.2348 > 127.0.0.1.59118: Flags [.], cksum 0xfe28 (incorrect -> 0x81ca), seq 1940761408, ack 1119853882, win 342, options [nop,nop,TS val 2900480312 ecr 2900155296], length 0
> 14:19:47.429724 IP (tos 0x0, ttl 64, id 50333, offset 0, flags [DF], proto TCP (6), length 40)
>     127.0.0.1.59118 > 127.0.0.1.2348: Flags [R], cksum 0xe1c9 (correct), seq 1119853882, win 0, length 0
> 14:19:48.452510 IP (tos 0x0, ttl 64, id 35003, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.59118 > 127.0.0.1.2348: Flags [S], cksum 0xfe30 (incorrect -> 0xf5f2), seq 1911409434, win 43690, options [mss 65495,sackOK,TS val 2900481335 ecr 0,nop,wscale 7], length 0
> 14:19:48.452533 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.2348 > 127.0.0.1.59118: Flags [S.], cksum 0xfe30 (incorrect -> 0x1929), seq 2748298959, ack 1911409435, win 43690, options [mss 65495,sackOK,TS val 2900481335 ecr 2900481335,nop,wscale 7], length 0
> 14:19:48.452547 IP (tos 0x0, ttl 64, id 35004, offset 0, flags [DF], proto TCP (6), length 52)
>     127.0.0.1.59118 > 127.0.0.1.2348: Flags [.], cksum 0xfe28 (incorrect -> 0xeb6d), seq 1911409435, ack 2748298960, win 342, options [nop,nop,TS val 2900481335 ecr 2900481335], length 0
>
> From what I can gather, either the ACK from the server or the RST from the
> client (which doesn't show in the tcpdump if it is occurring) is getting
> blocked by the INVALID state rule. If I allow everything on lo, I see the
> RST and the connection succeeds.
>
> I've tried setting nf_conntrack_log_invalid to 255, but I don't see any
> logs about what's invalid.
>
> I'm at a loss to explain why these packets are invalid. I'm also curious
> why I'm unable to replicate the issue. There seems to be something
> special about certain half open connections.
>
> I've attached packet captures. One shows a case where the timeout happens
> (synack_loop_timeout). The other is a case where I created a half open
> connection and the timeout didn't occur (expected_rst).
>
> What do you think?
>
> Thank you!
>
> Will

-- 
Anton Danilov.