Re: Strange http client/MTU problem under linux

On Thu, Jul 31, 2008 at 9:39 AM, Pekka Savola <pekkas@xxxxxxxxxx> wrote:
> I got a little bit interested in this, so below are a few pointers
> to how to continue with investigation.

Apologies for my delay in replying. I'm still desperate to find out
what's going on, however, so here's the next installment...!

> On Wed, 30 Jul 2008, j t wrote:
>>
>> Here are the results of the 2 boundary-case pings:
>>
>> $ ping -s 1472 -M do ocp.com.com
>> PING c18-ad-xw-lb.cnet.com (216.239.122.193) 1472(1500) bytes of data.
>> 64 bytes from c18-ad-xw-lb.cnet.com (216.239.122.193): icmp_seq=1
>> ttl=241 (truncated)
>
> Truncated means that you got less data than you expected.  Here you're
> requesting 1472B but you're only getting 56+8B.
>
> You should try to figure out where this is disappearing.  Can you do a ping
> -s 1472 without "truncated" with other sites internet? This will give you
> clues whether the issue is at your end or the destination network end.

Hmmm. Even this bit is weird:

In one group, there are hosts such as ocp.com.com (my original problem
host), www.google.com and www.cnet.com.
Whenever I ping these hosts and specify a packet size greater than 56
bytes, the replies get truncated. At exactly 56 bytes they come back
intact:

$ ping -s 56 -n ocp.com.com
PING c18-ad-xw-lb.cnet.com (216.239.122.193) 56(84) bytes of data.
64 bytes from 216.239.122.193: icmp_seq=1 ttl=241 time=132 ms
$ ping -s 56 -n www.google.com
PING www.l.google.com (66.249.91.99) 56(84) bytes of data.
64 bytes from 66.249.91.99: icmp_seq=1 ttl=247 time=26.0 ms
$ ping -s 56 -n www.cnet.com
PING c18-rb-tron-ssa-xw-split-lb.cnet.com (216.239.122.142) 56(84) bytes of data.
64 bytes from 216.239.122.142: icmp_seq=1 ttl=241 time=124 ms

but with just one more byte:
$ ping -s 57 -n ocp.com.com
PING c18-ad-xw-lb.cnet.com (216.239.122.193) 57(85) bytes of data.
64 bytes from 216.239.122.193: icmp_seq=1 ttl=241 (truncated)
$ ping -s 57 -n www.google.com
PING www.l.google.com (66.249.91.103) 57(85) bytes of data.
64 bytes from 66.249.91.103: icmp_seq=1 ttl=247 (truncated)
$ ping -s 57 -n www.cnet.com
PING c18-rb-tron-ssa-xw-split-lb.cnet.com (216.239.122.142) 57(85) bytes of data.
64 bytes from 216.239.122.142: icmp_seq=1 ttl=241 (truncated)

For each of these hosts (ocp.com.com, www.google.com & www.cnet.com),
I can ping them with sizes up to 1472 (and I get truncated results)
but if I increase the packetsize to 1473, I receive no replies at all:

$ ping -s 1472 -n www.google.com
PING www.l.google.com (66.249.91.147) 1472(1500) bytes of data.
64 bytes from 66.249.91.147: icmp_seq=1 ttl=247 (truncated)
64 bytes from 66.249.91.147: icmp_seq=2 ttl=247 (truncated)
--- www.l.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 39.231/39.264/39.298/0.200 ms

$ ping -s 1473 -n www.google.com
PING www.l.google.com (66.249.91.104) 1473(1501) bytes of data.
--- www.l.google.com ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2008ms
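
For reference, the size arithmetic behind the 1472/1473 boundary can be
sketched as below. This is only a sketch: it assumes IPv4 with a 20-byte
header (no IP options) plus the 8-byte ICMP echo header, and the function
names are made up for illustration. It reproduces the figures ping prints
in parentheses, e.g. 1472(1500) and 1473(1501), but says nothing about why
the replies come back truncated.

```python
# Header sizes in bytes. Assumption: IPv4 without IP options.
IPV4_HEADER = 20
ICMP_HEADER = 8


def ip_packet_size(payload: int) -> int:
    """Total IPv4 packet size for a ping sent with '-s payload'."""
    return payload + ICMP_HEADER + IPV4_HEADER


def fits_in_mtu(payload: int, mtu: int = 1500) -> bool:
    """True if the echo request fits in one packet on a link with this MTU."""
    return ip_packet_size(payload) <= mtu


if __name__ == "__main__":
    # Payload sizes taken from the ping runs above.
    for payload in (56, 57, 1472, 1473):
        print(payload, ip_packet_size(payload), fits_in_mtu(payload))
```

With the default MTU of 1500, a payload of 1472 is the largest that fits in
a single packet; 1473 needs fragmentation (or is refused when DF is set).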




In the other group are hosts such as slashdot.org, www.debian.org and
www.redhat.com. I can ping these with packet sizes of 2000 bytes
with no truncation:

$ ping -s 2000 -n slashdot.org
PING slashdot.org (216.34.181.45) 2000(2028) bytes of data.
2008 bytes from 216.34.181.45: icmp_seq=1 ttl=242 time=135 ms
$ ping -s 2000 -n www.debian.org
PING www.debian.org (194.109.137.218) 2000(2028) bytes of data.
2008 bytes from 194.109.137.218: icmp_seq=1 ttl=57 time=51.4 ms
$ ping -s 2000 -n www.redhat.com
PING e86.b.akamaiedge.net (88.221.176.112) 2000(2028) bytes of data.
2008 bytes from 88.221.176.112: icmp_seq=1 ttl=60 time=36.5 ms

I'll run lft against the 1st group to try to find a list of routers
between me and them and reply back with more info. Surely once I've
tested and mapped out enough hosts, I should be able to figure out
where the problem is... :-(



> My suspicion is that the load-balancer at ocp.com.com has interesting ICMP
> implementation that even if you ping it with big packets, it replies with
> small packets, and you can't figure out MTU issues like this.
>
>> $ ping -s 1473 -M do ocp.com.com
>> PING c18-ad-xw-lb.cnet.com (216.239.122.193) 1473(1501) bytes of data.
>>>
>>> From t60jt (192.168.0.3) icmp_seq=1 Frag needed and DF set (mtu = 1500)
>>> From t60jt (192.168.0.3) icmp_seq=1 Frag needed and DF set (mtu = 1500)
>>> From t60jt (192.168.0.3) icmp_seq=1 Frag needed and DF set (mtu = 1500)
>>> From t60jt (192.168.0.3) icmp_seq=1 Frag needed and DF set (mtu = 1500)
>>
>> If I am correct, success with "-s 1472" means that an mtu of 1500
>> should work (i.e. lowering the mtu down to 1499 should not be
>> necessary). Consequently, I don't want to drop the mtu down to 1499 if
>> that will simply mask/cover a bigger problem.
>
> Note that you're getting this ICMP message apparently from a local network
> and it doesn't prove much in and of itself.

Another quick question: do you say that I'm "getting this ICMP message
apparently from a local network" because it says "From t60jt
(192.168.0.3)" in the lines above? If so, what's the relevance? I
ask because t60jt is _my_ machine (the box I'm sitting in front of)!

>
> As for your questions:
>>
>> a) Dropping the mtu down to 1499 doesn't tell me why wget works under
>> windows (without the need to drop the mtu).
>>
>> b) Dropping the mtu down to 1499 doesn't tell me why wget (under
>> linux) works if I force my router to grab a public-facing ip address
>> in the range 93.96.x.x.
>>
>> c) Dropping the mtu down to 1499 doesn't agree with/explain the
>> results of the ping testing, which follows...
>
> If you want to figure this out, I think you'll need to run tcpdump on the
> host (both windows and linux) and compare the TCP streams as they seem to
> you.  Specifically I'd look for the MSS negotiated size, whether one uses
> fragments and one doesn't, and used TCP options.
> (Even better would be doing a few tests to another host in internet, which
> is also running tcpdump.  This would show if your ISP is modifying any
> packets.)

I'll try running tcpdump on my (openwrt) internet gateway / router /
ipmasq box this week and report back with the results.
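
As a reference point for the MSS comparison suggested above, here is a
rough sketch of the value one would expect to see advertised in the SYN
packets. It assumes IPv4 and the usual convention that the MSS is the MTU
minus 20 bytes of IP header and 20 bytes of TCP header; the helper names
are hypothetical and not part of tcpdump or any other tool.

```python
# Header sizes in bytes. Assumption: IPv4, fixed-size headers only.
IPV4_HEADER = 20
TCP_HEADER = 20


def expected_mss(mtu: int) -> int:
    """MSS a host would normally advertise for a link with this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def mss_looks_reduced(advertised_mss: int, mtu: int = 1500) -> bool:
    """True if the MSS seen in a capture is below what the MTU implies."""
    return advertised_mss < expected_mss(mtu)


if __name__ == "__main__":
    # The two MTU values discussed in this thread.
    for mtu in (1500, 1499):
        print(mtu, expected_mss(mtu))
```

If the MSS seen on one side of the router is smaller than the value
advertised by the sending host, something in between is probably rewriting
it, which would be worth knowing when comparing the windows and linux
captures.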

Thanks again for all comments / replies / input.... Jaime :-)
