Re: NAT stops forwarding ACKs after PMTU discovery

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

I would say, the problem is due to a sequence-number rewriting
middlebox, who does not correctly handle the SACK-blocks.

In remote.pcap, you see:
Packet#10: A Dup-ACK: ACK 1005503816, SACK: 1005505184-1005505648
The SACK actually covers Packet#9

In tun0.pcap, you have:
Packet#9: The one that is SACK'ed: SEQ: 3514869772
Packet#11: The Dup-ACK: ACK: 3514868404, SACK: 3570452498-3570452962

You can see that the SACK-block is not really aligned with the ACK-numbers.

Netfilter is probably dropping the Dup-ACK, because the SACK-block is
acknowledging unseen data.


There are middleboxes out there that randomize the sequence numbers, due to
an old bug in Windows, where the initial sequence number was not
sufficiently randomized. There are some of these middleboxes, who simply do
not support SACK, thus modify only the sequence numbers of the TCP-header
(https://supportforums.cisco.com/docs/DOC-12668#TCP_Sequence_Number_Randomization_and_SACK).

This results in a TCP retransmission timeout on the sender, because of
the invalid SACK-blocks, the duplicate ACKs are not accounted. This
completly kills the performance, as you can see in our presentation given at
IETF87: http://tools.ietf.org/agenda/87/slides/slides-87-tcpm-11.pdf

We have a patch that accounts DUP-ACKs with invalid SACK-blocks effectively
as duplicate acknowledgments. I can send the patches, if the
netdev-community is interested in accepting these upstream.


Cheers,
Christoph



On 19/08/13 - 01:43:18, Corey Hickey wrote:
> On 2013-08-18 17:03, Eric Dumazet wrote:
> > On Sun, 2013-08-18 at 17:00 -0700, Eric Dumazet wrote:
> > 
> >> Code like this seems very suspect to me :
> >>
> >> before(sack, receiver->td_end + 1)
> >>
> > 
> > My suggestion would be to try :
> > 
> > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> 
> [...]
> 
> Thanks for all your suggestions--I really wasn't expecting so much on a
> weekend. Here's all the data I have for tonight.
> 
> 
> I tried the linux-next kernel, then linux-next with your patch applied.
> Neither of them fix the problem, unfortunately. I have taken tcpdumps
> for a working SSH and a failing SSH.
> 
> http://fatooh.org/files/tmp/linux-next-patch1.tar.bz2
> 
> [localhost]
> sudo tcpdump -ni br0 -s 0 -w /tmp/local.pcap 'host 10.15.24.13 or icmp'
> 
> [router eth0]
> tcpdump -ni eth0 -s 0 -w /tmp/eth0.pcap \
>     'host 10.15.24.13 or (icmp and host not 69.78.33.132)'
> * the exclusion here is just to remove some unrelated clutter
> 
> [router tun0]
> tcpdump -ni tun0 -s 0 -w /tmp/tun0.pcap -s 0 'host 10.15.24.13 or icmp'
> 
> [remote]
> tcpdump -ni eth0 -s 0 -w remote.pcap 'host 192.168.61.56'
> 
> 
> Some notes:
> 
> 1. I tested the new kernels only on the Linux router, assuming that is
> where it was intended.
> 
> 2. I take back what I wrote earlier about every connection that involves
> PMTU discovery failed; I may have been observing this wrong. For now,
> the situation is that some connections stop forwarding packets from the
> remote host immediately after the retransmit, while other work fine.
> 
> 3. From local.pcap, you can see that my localhost doesn't actually
> transmit a large packet, yet the router's eth0 sees a large packet come
> in. I think this is due to TSO, but I'm not completely sure.
> 
> 4. For some reason, I cannot reproduce this when SSHing to a host at
> work that is running Debian sid with 3.10-1-amd64, but I can reproduce
> it when SSHing to hosts running Centos 6.4 with
> 2.6.32-358.6.1.el6.x86_64 (which surely has a ton of patches applied,
> for whatever that's worth).
> 
> 5. I have only a vague understanding of SACK; I will be reading up on
> this soon. I will also look into packetdrill for reproducing the
> problem, if the SSH results aren't good enough.
> 
> 6. If I reduce the MTU on localhost to match the path MTU, the problem
> does go away.
> 
> Thanks again for all the help,
> Corey
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Netfitler Users]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux