Re: Transfer stalls with NAT under 2.6.24.3

Patrick McHardy wrote:
Sven Riedel wrote:
Hi,
I've run into a strange problem where large file transfers start stalling over a NATed connection. Packet traces reveal that ACK packets are sometimes not passed through to the inside (NATed) host, which results in a transfer stall until a TCP timeout occurs and the other side retransmits the ACK.

This only seems to happen if the conntrack table on the firewall already contains an entry for the same source and destination in TIME_WAIT state. If no conntrack entries exist for the same source and destination, the packets flow fine.

The problem seems to be alleviated by setting ip_conntrack_tcp_be_liberal to 1, but this seems to be only a workaround, not a real solution.

Scatter-gather and TCP segmentation offloading have been disabled on the relevant NICs on the firewall during debugging, to make sure this isn't a hardware issue.

Is this issue known, and is there a patch available, or is further information needed to help debug the problem?

2.6.24.3 includes a patch that was supposed to fix problems
with connections in TIME_WAIT state. Does 2.6.24.2 work better
for you?

The firewall system in question is currently in production. I _might_ be able to try the other kernel tomorrow morning. Once I am able to try it, I'll let you know.


Please enable conntrack logging for TCP by executing:

echo 6 >/proc/sys/net/netfilter/nf_conntrack_log_invalid

and check whether you get any messages in the ring buffer.

Yep, lots ;)

In the following, 100.100.100.100 is the external machine and 200.200.200.200 is the NAT IP address on the firewall. A 5MB file was transferred via scp to 100.100.100.100 from the internal network.

The output during a "clean" run, with an empty conntrack table and no stalls:
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42121
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E351B138AA40101050AE50974FBE5097A53)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42122
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E361B138AA40101050AE50974FBE5097FAB)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42123
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E361B138AA40101050AE50974FBE5098503)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42124
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E371B138AA40101050AE50974FBE5098A5B)
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42125
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720612 ACK=3828427355 WINDOW=47880
RES=0x00 ACK URGP=0 OPT (0101080A45585E381B138AA40101050AE50974FBE5098FB3)
printk: 24 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42248
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355720852 ACK=3828837755 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A45585F911B138E140101050AE50FB2C3E50FD2D3)
printk: 31 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42465
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355721284 ACK=3829614779 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455861861B1392E10101050AE51B935BE51BB8C3)
printk: 25 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42718
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355721716 ACK=3830353499 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A455863DA1B1398B70101050AE526E3ABE526E903)
printk: 57 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=42976
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355722052 ACK=3830954051 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455865791B139CBE0101050AE52FFD8BE530284B)
printk: 27 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=72 TOS=0x00 PREC=0x00 TTL=56 ID=43306
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355722580 ACK=3831787163 WINDOW=42408
RES=0x00 ACK URGP=0 OPT
(0101080A455867731B13A19501010512E53CCB53E53CD653E53CBE93E53CC3EB)
printk: 74 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=43789
DF PROTO=TCP SPT=22 DPT=43021 SEQ=355723252 ACK=3832978011 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A45586A571B13A8CF0101050AE54EDEABE54EE403)
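
For anyone who wants to compare the two runs without reading the raw output by eye, here is a minimal Python sketch that joins the wrapped log lines back into one record per packet and adds up the "messages suppressed" counters. The function name parse_conntrack_log and the record layout are made up for illustration; nothing here is part of netfilter itself, and it assumes the log looks like the paste above.

```python
import re

SUPPRESSED_RE = re.compile(r"printk: (\d+) messages suppressed\.")
FIELD_RE = re.compile(r"\b([A-Z]+)=(\S*)")
OPT_RE = re.compile(r"OPT \(([0-9A-Fa-f]+)\)")


def parse_conntrack_log(text):
    """Split pasted kernel log text into per-packet records.

    Returns (records, suppressed_total). Each record is a dict of the
    KEY=value fields plus "message" and "opt" (hex string or None).
    """
    chunks = []            # one list of physical lines per logged packet
    suppressed_total = 0
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SUPPRESSED_RE.fullmatch(line)
        if match:
            suppressed_total += int(match.group(1))
            current = None
        elif line.startswith("nf_ct_tcp:"):
            current = [line]
            chunks.append(current)
        elif current is not None and ("=" in line or line.startswith("(")):
            current.append(line)   # wrapped continuation of the same record
        else:
            current = None         # blank line or free-text annotation

    records = []
    for chunk in chunks:
        joined = " ".join(chunk)
        record = dict(FIELD_RE.findall(joined))
        record["message"] = joined[len("nf_ct_tcp:"):].split(" IN=")[0].strip()
        opt = OPT_RE.search(joined)
        record["opt"] = opt.group(1) if opt else None
        for key in ("SEQ", "ACK", "WINDOW", "LEN"):
            if key in record:
                record[key] = int(record[key])
        records.append(record)
    return records, suppressed_total


if __name__ == "__main__":
    import sys
    recs, suppressed = parse_conntrack_log(sys.stdin.read())
    print(len(recs), "logged packets,", suppressed, "suppressed by printk")
    for rec in recs:
        print(rec["DPT"], rec["SEQ"], rec["ACK"], rec["WINDOW"])
```

Saved to a file, it can be fed the pasted text on stdin. Since printk rate-limits the output, the number of logged packets plus the suppressed total gives a rough idea of how many packets conntrack actually flagged.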

During a run with stalls:

nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=80 TOS=0x00 PREC=0x00 TTL=56 ID=44105
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596614326 WINDOW=49248
RES=0x00 ACK URGP=0 OPT
(0101080A4558793C1B13CE350101051A491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61)

^^^^ Transfer stalled here for ~10 seconds.


printk: 22 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=72 TOS=0x00 PREC=0x00 TTL=56 ID=44113
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596632110 WINDOW=49248
RES=0x00 ACK URGP=0 OPT
(0101080A45587D301B13D81801010512491E8751491E8CA9491E7B71491E81F9)
printk: 12 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=44114
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160349927 ACK=596635150 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455881B21B13E35A0101050A491E8751491E8CA9)
printk: 14 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=7320
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350311 ACK=597280038 WINDOW=27360
RES=0x00 ACK URGP=0 OPT (0101080A455883D31B13E8820101050A49286E71492873C9)
printk: 32 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=7451
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350503 ACK=597578342 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455885161B13EBD30101050A492CEBA9492CF659)
printk: 35 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=7786
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160350983 ACK=598415558 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A455887081B13F0890101050A4939B2094939E221)
printk: 54 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=8021
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160351319 ACK=598980542 WINDOW=42408
RES=0x00 ACK URGP=0 OPT (0101080A455889151B13F5C00101050A4942510149425659)
printk: 43 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=8205
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160351559 ACK=599403254 WINDOW=49248
RES=0x00 ACK URGP=0 OPT (0101080A45588B011B13FA9B0101050A4948C4394948C991)
printk: 40 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=8531
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352039 ACK=600218582 WINDOW=45144
RES=0x00 ACK URGP=0 OPT (0101080A45588D371B1400160101050A4955351949553A71)
printk: 49 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=8871
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352519 ACK=601058534 WINDOW=38304
RES=0x00 ACK URGP=0 OPT (0101080A45588F521B1405500101050A4962062949620B81)
printk: 45 messages suppressed.
nf_ct_tcp: ACK is over the upper bound (ACKed data not seen yet) IN= OUT=
SRC=100.100.100.100 DST=200.200.200.200 LEN=64 TOS=0x00 PREC=0x00 TTL=56 ID=8988
DF PROTO=TCP SPT=22 DPT=35858 SEQ=4160352663 ACK=601307510 WINDOW=41040
RES=0x00 ACK URGP=0 OPT (0101080A4558910A1B1409AB0101050A4965D2B94965D811)
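
The OPT (...) field in these messages is the raw TCP option bytes in hex. A small Python sketch for decoding it follows, in case it helps with reading the traces. It only understands the option kinds that show up above (NOP, timestamp, SACK) and passes everything else through as hex. The function name decode_tcp_options is hypothetical.

```python
import struct


def decode_tcp_options(opt_hex):
    """Decode the hex string from a netfilter log 'OPT (...)' field.

    Returns a list of tuples:
      ("timestamp", tsval, tsecr)
      ("sack", [(left_edge, right_edge), ...])
      ("kind-N", hex_payload) for anything else
    """
    data = bytes.fromhex(opt_hex)
    options = []
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:                 # End of Option List
            break
        if kind == 1:                 # NOP, single-byte padding
            i += 1
            continue
        if i + 1 >= len(data):
            break                     # truncated option header
        length = data[i + 1]
        if length < 2 or i + length > len(data):
            break                     # malformed length, stop decoding
        body = data[i + 2:i + length]
        if kind == 8 and length == 10:
            tsval, tsecr = struct.unpack("!II", body)
            options.append(("timestamp", tsval, tsecr))
        elif kind == 5 and len(body) % 8 == 0:
            edges = struct.unpack("!%dI" % (len(body) // 4), body)
            options.append(("sack", list(zip(edges[0::2], edges[1::2]))))
        else:
            options.append(("kind-%d" % kind, body.hex()))
        i += length
    return options


if __name__ == "__main__":
    # Last message before the stall and first message after it (DPT=35858).
    before = decode_tcp_options(
        "0101080A4558793C1B13CE350101051A"
        "491E8751491E8CA9491E7B71491E81F9491E40A9491E5B61")
    after = decode_tcp_options(
        "0101080A45587D301B13D81801010512"
        "491E8751491E8CA9491E7B71491E81F9")
    print(before)
    print(after)
    print("TSval delta:", after[0][1] - before[0][1])
```

For the two messages on either side of the stall, the sender's TSval advances by 1012 ticks. If 100.100.100.100 ticks its timestamp clock at 100 Hz (an assumption, the rate is not known from the trace), that is roughly 10 seconds, which would match the observed stall. The decoded SACK edges can also be compared directly against the ACK= value of the same packet.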


Regards,
Sven
--
sven.riedel@xxxxxxxxxxxx

SecureNet GmbH
Intranet & Internet Solutions
Frankfurter Ring 193a
D-80807 München
Tel: +49 89 32133-632
Fax: +49 89 32133-699
Zentrale: -600
www.securenet.de

Sitz der Gesellschaft: München
HRB München 118876
Geschäftsführer: Thomas Schreiber

