Multilink PPPoE - Header corruption

Searching the archives, I see that this list has been relatively quiet of late, probably because the PPP systems are considered stable and "just work". However, I've hit an interesting issue that I can't find a solution to.

I have a Linux-based router. It's nothing fancy, but it has 8 Intel Ethernet ports: 4 x Gb and 4 x 10/100. It's running Debian with a custom-compiled 2.6.35.13 kernel tailored to the hardware (although I've just now tried 3.0.4 and see the same issues).

(I'm using the 2.6.35 kernel because it's the latest flagged for long-term support and I'm using it elsewhere in routers, servers, etc. I'm not averse to trying a different kernel though, especially if someone can say "Oh look, a regression, try this" or something similar.)

Anyway, one of the 10/100 Ethernet ports is running PPPoE to a Vigor 120 modem acting as a PPPoE <> PPPoA bridge to the ISP. This is working fine and is something I've done several times in the past.

Another 2 ports connect in a similar manner via a pair of Vigor 120s to a different ISP, which is providing a bonded service.

I can bring each of these lines up individually and pass data over them without any issues. I can bring them both up, and I see the links bundled together in the log messages, but then the fun begins: I can ping the far end (and remote hosts in general), but I can't transfer any data. What appears to be happening is that the first SYN packet that leaves the router is corrupted, or the SYN+ACK packet sent in response to a remote connection is similarly corrupted.

For example, here is tshark watching the Ethernet port connecting the router to the modem when I try to connect in remotely (ssh; 195.10.225.68 is my remote host, 93.89.81.142 is the router):

  7.494318 195.10.225.68 -> 93.89.81.142 TCP 40323 > 22 [SYN] Seq=0 Win=5840 Len=0 TSV=133234063 TSER=0 MSS=1446 WS=6

The router then sends the SYN+ACK back, or should have, but instead sends this:

  7.498888 93.89.81.142 -> 195.10.225.68 TCP 22 > 40323[Malformed Packet]

The far end also sees a zero-length header message.
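
One plausible reading of a "zero-length header" is that the TCP data-offset field in the mangled segment is below the legal minimum. That is an assumption, since the exact message from the far end isn't quoted here. A minimal sketch of checking that field by hand on raw TCP bytes pulled from a capture follows; the sample segments are made up for illustration (only the ports come from the trace above) and the function names are hypothetical:

```python
import struct


def tcp_header_length(segment: bytes) -> int:
    """Return the TCP header length in bytes as declared by the data-offset field."""
    if len(segment) < 20:
        raise ValueError("need at least 20 bytes of TCP header")
    # Byte 12 carries the data offset in its upper four bits, in 32-bit words.
    offset_words = segment[12] >> 4
    return offset_words * 4


def header_looks_valid(segment: bytes) -> bool:
    """A legal TCP header is at least 20 bytes and fits inside the segment."""
    return 20 <= tcp_header_length(segment) <= len(segment)


# Illustrative SYN+ACK from port 22 to port 40323. The sequence numbers and
# window size are invented values, not taken from the real capture.
good = struct.pack("!HHIIBBHHH", 22, 40323, 0, 1, 5 << 4, 0x12, 5792, 0, 0)

# The same segment with the data-offset byte zeroed, which is one way a
# header could end up being reported as zero length.
bad = good[:12] + bytes([0]) + good[13:]

print(tcp_header_length(good), header_looks_valid(good))
print(tcp_header_length(bad), header_looks_valid(bad))
```

Running the same check on the bytes of the malformed reply would show whether the data-offset field is what the dissector objects to, or whether the damage lies elsewhere in the header.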

The link does work in some way though. For example, if I bring up one link and establish a connection (e.g. an scp of a big file), then bring up the 2nd line, the transfer carries on working at double the speed, which is to be expected. But if I then try a new connection, the SYN or SYN+ACK packet is corrupted as before.
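
Because the capture is taken on the Ethernet side, each frame on a bundled link should carry a PPPoE session header and then a multilink header ahead of the IP packet. A rough sketch of peeling those layers off by hand is below, to see where the inner packet starts. It assumes the RFC 1990 long sequence number format and no protocol-field compression, neither of which is confirmed for this setup; the function names and sample bytes are hypothetical:

```python
import struct

PPP_MP = 0x003D    # PPP protocol number for a multilink fragment
PPP_IPV4 = 0x0021  # PPP protocol number for IPv4


def parse_pppoe_session(payload: bytes):
    """Split a PPPoE session payload (EtherType 0x8864) into session ID and PPP data."""
    ver_type, code, session_id, length = struct.unpack("!BBHH", payload[:6])
    if ver_type != 0x11 or code != 0x00:
        raise ValueError("not a PPPoE session data frame")
    return session_id, payload[6:6 + length]


def parse_mp_fragment(ppp: bytes):
    """Decode a multilink header, assuming 24-bit (long) sequence numbers."""
    (proto,) = struct.unpack("!H", ppp[:2])
    if proto != PPP_MP:
        raise ValueError("not a multilink fragment")
    (word,) = struct.unpack("!I", ppp[2:6])
    begin = bool(word & 0x80000000)  # B bit: first fragment of a packet
    end = bool(word & 0x40000000)    # E bit: last fragment of a packet
    seq = word & 0x00FFFFFF          # 24-bit sequence number
    return begin, end, seq, ppp[6:]


# Made-up frame: session ID 1, an unfragmented packet (B and E both set),
# sequence number 7, carrying the first four bytes of an IPv4 header.
inner = struct.pack("!H", PPP_IPV4) + b"\x45\x00\x00\x28"
ppp = struct.pack("!HI", PPP_MP, 0xC0000007) + inner
frame = struct.pack("!BBHH", 0x11, 0x00, 1, len(ppp)) + ppp

session_id, ppp_data = parse_pppoe_session(frame)
begin, end, seq, fragment = parse_mp_fragment(ppp_data)
print(session_id, begin, end, seq, fragment.hex())
```

Comparing the decoded offsets for a working packet (such as a ping) against a corrupted SYN+ACK might show whether the damage is introduced before or after the multilink encapsulation.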

I've tried turning the firewalling and the MTU and MSS clamping on and off, with various particular values, to no avail. I've also tried both the kernel PPPoE driver and the userspace RP-PPPoE one.
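
For reference, the arithmetic behind an MSS clamp on a PPPoE link can be sketched as follows. The header sizes are the standard ones for PPPoE, IPv4 and TCP without options. The `extra_overhead` parameter is a hypothetical knob for any additional per-packet bytes such as a multilink header, and whether that needs to be subtracted on this particular setup is an assumption rather than something established here:

```python
IPV4_HEADER = 20   # IPv4 header without options
TCP_HEADER = 20    # TCP header without options
PPPOE_HEADER = 6   # version/type, code, session ID, length
PPP_PROTOCOL = 2   # PPP protocol field inside the PPPoE payload


def pppoe_mtu(ethernet_mtu: int = 1500) -> int:
    """Largest IP packet that fits in one PPPoE frame on the given Ethernet MTU."""
    return ethernet_mtu - PPPOE_HEADER - PPP_PROTOCOL


def clamped_mss(link_mtu: int, extra_overhead: int = 0) -> int:
    """MSS that keeps a full-sized TCP segment within the link MTU.

    extra_overhead is a hypothetical allowance for further encapsulation,
    for example 6 bytes for a multilink protocol field plus a long
    sequence number header.
    """
    return link_mtu - extra_overhead - IPV4_HEADER - TCP_HEADER


mtu = pppoe_mtu()
print(mtu)                                 # 1492
print(clamped_mss(mtu))                    # 1452
print(clamped_mss(mtu, extra_overhead=6))  # 1446
```

Note that the SYN and SYN+ACK packets that get corrupted carry no payload, so a wrong MSS or MTU on its own would not obviously explain them. That fits with the clamping changes making no difference.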

So it's quite frustrating!

A quick glance at the code suggests no changes since about 2005, which implies either that many others are using it successfully, or that no-one is and I'm the first to try for some years (which I don't believe!).

The ISP providing the bonded line has other customers doing the same, and I'm not convinced the issue is at their end. Seeing that corrupted packet come out of my router points the finger right at the router, as far as I can tell.

If anyone has any clues, suggestions or ideas, I'm all ears! I may not be able to try out much as the site is generally "live" with people using the initial link to the first ISP on a daily basis, so any tests/experiments may have to wait until an evening or weekend, but at this stage I'm willing to try almost anything...

Cheers,

Gordon
(Based in the UK, if that makes any difference)