Re: PATCH: "invalid SYNIN=" - a patch and a question

Krzysztof Oledzki <ole@xxxxxx> · Wed, 3 Oct 2007 15:06:37 +0200 (CEST)

On Wed, 26 Sep 2007, Krzysztof Oledzki wrote:

Hello,

The attached patch should fix the missing space between "SYN" and "IN=".

nf_ct_tcp: invalid SYNIN= OUT= SRC=192.168.150.16 DST=192.168.50.21 LEN=60 
TOS=0x00 PREC=0x00 TTL=64 ID=19810 DF PROTO=TCP SPT=43183 DPT=80 
SEQ=3917241971 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT 
(020405B40402080A3C14363B0000000001030307) UID=451

192.168.150.16 <- my local ip address
192.168.50.21  <- remote server

My question is about this message. Apparently there is something wrong with my 
configuration. I'm running an HTTP proxy/load balancer, so my server makes _a 
lot_ of connections between the same pair of IP addresses. I noticed that many 
of them were not successful. I know about the TIME-WAIT issue, but as long as 
there are enough free ports for the current connection rate (in my situation 
about 100/s) it should not be a problem.

So, with net.netfilter.nf_conntrack_log_invalid set to 255 I get:

grep "invalid SYNIN" /var/log/syslog |wc -l
1186

Could the problem be that net.netfilter.nf_conntrack_tcp_timeout_time_wait 
is 120s by default, while TCP_TIMEWAIT_LEN is 60s?

/usr/src/linux/include/net/tcp.h:#define TCP_TIMEWAIT_LEN (60*HZ) /* how long 
to wait to destroy TIME-WAIT

I tried changing net.netfilter.nf_conntrack_tcp_timeout_time_wait to 60, but 
this does not help much.

A small update on this issue. It seems that the IP stack and netfilter do 
indeed handle connections using different timers:

# wget --bind-address 192.168.0.1 192.168.129.28 -O /dev/null

--14:36:00--  http://192.168.129.28/
           => `/dev/null'
Connecting to 192.168.129.28:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 915 [text/html]

100%[==========================================================================================>] 
915           --.--K/s

14:36:00 (13.43 MB/s) - `/dev/null' saved [915/915]

# conntrack -L |grep =192.168.129.28|grep 192.168.0.1:
tcp      6 119 TIME_WAIT src=192.168.0.1 dst=192.168.129.28 sport=18300 dport=80 packets=6 bytes=422 src=192.168.129.28 dst=192.168.0.1 sport=80 dport=18300 packets=4 bytes=1475 [ASSURED] mark=0 use=1

# ss -anto |egrep "192.168.0.1.*192.168.129.28":
TIME-WAIT  0      0               192.168.0.1:18300       192.168.129.28:80     timer:(timewait,58sec,0)

After 60s the kernel is able to reuse this port (18300), but while the 
conntrack entry is still alive the new SYN will be blocked by the local 
netfilter with this "invalid SYN" message. This is even more likely on newer 
kernels with TCP port randomization.
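
To make the timing concrete, here is a minimal userspace sketch in C (not 
kernel code) of the window in which the socket's TIME-WAIT timer has already 
expired but the conntrack entry still exists. The function and macro names are 
made up for illustration; only the 60s and 120s values come from the 
observations above.

```c
#include <stdio.h>

/* Illustrative constants; the values are the ones observed above. */
#define SOCK_TIME_WAIT_SECS 60   /* TCP_TIMEWAIT_LEN / HZ */
#define CT_TIME_WAIT_SECS   120  /* conntrack TIME_WAIT timeout default */

/*
 * Hypothetical helper: returns 1 if a SYN that reuses the same 4-tuple
 * `elapsed` seconds after the close would find the socket already gone
 * (so the port is reusable) while the conntrack entry is still present.
 */
static int in_stale_window(int elapsed, int sock_tw, int ct_tw)
{
    return elapsed >= sock_tw && elapsed < ct_tw;
}

/* Length of that window in seconds; 0 if conntrack expires first. */
static int stale_window_len(int sock_tw, int ct_tw)
{
    return ct_tw > sock_tw ? ct_tw - sock_tw : 0;
}
```

With the defaults, stale_window_len(SOCK_TIME_WAIT_SECS, CT_TIME_WAIT_SECS) 
is 60, and setting both timers to 60 makes the window empty in this simplified 
model. The model ignores the case where the two timers are not started at the 
same moment.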

This does not solve my problem, but I think we should consider changing 
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait from 120s to 60s.

I am also wondering whether the code from nf_conntrack_proto_tcp.c is correct:

--- cut here ---
new_state = tcp_conntracks[dir][index][old_state];

switch (new_state)
(...)
case TCP_CONNTRACK_SYN_SENT:
if (old_state < TCP_CONNTRACK_TIME_WAIT)
        break;
if ((conntrack->proto.tcp.seen[dir].flags &
        IP_CT_TCP_FLAG_CLOSE_INIT)
    || after(ntohl(th->seq),
             conntrack->proto.tcp.seen[dir].td_end)) {
        /* Attempt to reopen a closed connection.
        * Delete this connection and look up again. */
        write_unlock_bh(&tcp_lock);
        if (del_timer(&conntrack->timeout))
                conntrack->timeout.function((unsigned long)
                                            conntrack);
        return -NF_REPEAT;
} else {
        write_unlock_bh(&tcp_lock);
        if (LOG_INVALID(IPPROTO_TCP))
                nf_log_packet(pf, 0, skb, NULL, NULL,
                              NULL, "nf_ct_tcp: invalid SYN");
        return -NF_ACCEPT;
}
--- cut here ---

It seems that tcp_conntracks allows such a (TCP_CONNTRACK_TIME_WAIT -> 
TCP_CONNTRACK_SYN_SENT) transition, pointing to RFC 1122:

When a connection is closed actively, it MUST linger in
TIME-WAIT state for a time 2xMSL (Maximum Segment Lifetime).
However, it MAY accept a new SYN from the remote TCP to
reopen the connection directly from TIME-WAIT state, if it:

(1)  assigns its initial sequence number for the new
     connection to be larger than the largest sequence
     number it used on the previous connection incarnation,
     and

(2)  returns to TIME-WAIT state if the SYN turns out to be
     an old duplicate.

So it seems that the "after(...)" check does not match this packet, right?
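
For reference, a small userspace model in C of the branch quoted above, in 
case it helps to reason about which SYNs get logged. seq_after() is modelled 
on the kernel's wraparound-safe after() comparison; classify_syn() and the 
enum are hypothetical names of mine, and the td_end values mentioned below are 
invented, since the log does not show them.

```c
#include <stdint.h>

/*
 * Wraparound-safe sequence comparison, modelled on the kernel's
 * before()/after() helpers: true when `a` is later than `b` in
 * 32-bit sequence space.
 */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

enum syn_verdict {
    SYN_REOPEN,   /* entry deleted, packet looked up again (-NF_REPEAT) */
    SYN_INVALID   /* logged as "nf_ct_tcp: invalid SYN" (-NF_ACCEPT) */
};

/*
 * Hypothetical model of the quoted branch for old_state >= TIME_WAIT.
 * close_init_seen stands for IP_CT_TCP_FLAG_CLOSE_INIT being set on
 * seen[dir] for the direction of the SYN; td_end is seen[dir].td_end.
 */
static enum syn_verdict classify_syn(int close_init_seen,
                                     uint32_t syn_seq, uint32_t td_end)
{
    if (close_init_seen || seq_after(syn_seq, td_end))
        return SYN_REOPEN;
    return SYN_INVALID;
}
```

Taking SEQ=3917241971 from the log: with an invented td_end of 3917241000 the 
SYN would reopen the connection, while with an invented td_end of 1000 the new 
sequence number is "before" td_end in sequence space, so the SYN is logged as 
invalid unless CLOSE_INIT is set for that direction. With randomly chosen 
initial sequence numbers the second case cannot be ruled out.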

Patrick, what do you think about this?

Best regards,

				Krzysztof Olędzki