Re: Performance with ethernet channel bonding

I'm not sure what you are describing is the problem I'm seeing. Let me
recap the configuration & ask if reordering SHOULD occur [and trigger the
congestion detection].

  System 1                 System 2
  dual CPU's               dual CPU's
  NIC 1  ---- switch1 ---- NIC 1 (eth0)

  NIC 2  ---- switch2 ---- NIC 2 (eth1)

The "bond0" interface has both eth0 and eth1 used for the connections
between the two systems. The NIC's, cables & switches are independent. Both
switches are private LAN's, no planned activity other than the test except
for typical daemon activity for the systems under test.

The netpipe program basically exchanges data between the two systems,
starting with small block sizes & growing through "interesting" sizes [the
default is basically 2^n +/- 1], and records throughput [bytes per second]
and latency [1/2 the round trip time]. I don't think it uses multiple
streams of data - I will check to be sure.
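For reference, the core of the measurement is a ping-pong loop, roughly
like this minimal sketch [not netpipe's actual code; the port number &
repetition count here are arbitrary]:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define PORT 5150   /* arbitrary */
#define REPS 100    /* repetitions per block size */

/* Send or receive exactly len bytes, handling short transfers. */
static void xfer(int fd, char *buf, size_t len, int sending)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = sending ? send(fd, buf + done, len - done, 0)
                            : recv(fd, buf + done, len - done, 0);
        if (n <= 0) { perror("xfer"); exit(1); }
        done += n;
    }
}

int main(int argc, char **argv)
{
    int server = (argc < 2);   /* no argument: echo side; else: client <host> */
    struct sockaddr_in sa = { 0 };
    int fd;

    sa.sin_family = AF_INET;
    sa.sin_port = htons(PORT);
    if (server) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0), one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
        sa.sin_addr.s_addr = INADDR_ANY;
        if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
            listen(lfd, 1) < 0) { perror("listen"); exit(1); }
        fd = accept(lfd, NULL, NULL);
    } else {
        fd = socket(AF_INET, SOCK_STREAM, 0);
        inet_pton(AF_INET, argv[1], &sa.sin_addr);
        if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
            perror("connect"); exit(1);
        }
    }

    /* Walk block sizes by powers of two [netpipe also probes 2^n +/- 1]. */
    for (size_t len = 1; len <= (1 << 20); len <<= 1) {
        char *buf = malloc(len);
        struct timeval t0, t1;

        memset(buf, 0xAA, len);
        gettimeofday(&t0, NULL);
        for (int i = 0; i < REPS; i++) {
            if (server) { xfer(fd, buf, len, 0); xfer(fd, buf, len, 1); }
            else        { xfer(fd, buf, len, 1); xfer(fd, buf, len, 0); }
        }
        gettimeofday(&t1, NULL);
        if (!server) {
            double rtt = ((t1.tv_sec - t0.tv_sec) +
                          (t1.tv_usec - t0.tv_usec) / 1e6) / REPS;
            printf("%8lu bytes: latency %.1f us, throughput %.2f MB/s\n",
                   (unsigned long)len, rtt / 2 * 1e6, 2.0 * len / rtt / 1e6);
        }
        free(buf);
    }
    close(fd);
    return 0;
}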

What I see is
 - the single channel result [prior to channel bonding] is OK, no odd
behavior.
 - the two channel results [with channel bonding] are not OK, severe
dropouts.
That indicates to me that the basic drivers and switches are sound. It
also indicates to me that the bond0 interface is getting confused. I assume
that the channel bonding code stripes the packets across eth0 and eth1,
sending about 1/2 the data on each - the results from ifconfig tend to
confirm this [transmits on eth0/eth1 add up to transmits on bond0, and byte
counts are similar between eth0 and eth1].
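If the bonding code does stripe the packets round-robin over the two
slaves, even a small difference in per-link latency would deliver a single
stream out of order. A toy simulation [made-up delays, not the bonding
driver] shows the effect:

/* Toy simulation: stripe packets 0..9 round-robin over two links with
 * slightly different one-way delays and print the arrival order.
 * The delays are invented purely for illustration. */
#include <stdio.h>
#include <stdlib.h>

struct arrival { int seq; double t; };

static int by_time(const void *a, const void *b)
{
    double d = ((const struct arrival *)a)->t -
               ((const struct arrival *)b)->t;
    return (d > 0) - (d < 0);
}

int main(void)
{
    double delay[2] = { 100.0, 135.0 };  /* hypothetical one-way delays, us */
    double gap = 20.0;                   /* send interval per packet, us */
    struct arrival arr[10];

    for (int seq = 0; seq < 10; seq++) {
        int link = seq % 2;              /* round-robin slave selection */
        arr[seq].seq = seq;
        arr[seq].t = seq * gap + delay[link];
    }
    qsort(arr, 10, sizeof arr[0], by_time);

    printf("arrival order:");
    for (int i = 0; i < 10; i++)
        printf(" %d", arr[i].seq);
    printf("\n");  /* prints 0 2 1 4 3 6 5 8 7 9 - every other one is late */
    return 0;
}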
 - Would the channel bonding code send the packets out of order [to cause
the problem you describe]?
 - I thought about packet drops as well, perhaps an SMP race condition?
We've noted that the error counts from ifconfig don't increment during the
run, apart from a slight increase at the end, so they aren't counting the
problems if they do occur. [We had 22 overruns recorded after >1.5M packets
were transmitted & received without any overruns! Overruns on one machine
only, not both.]
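To catch the counters moving during a run rather than after it, something
like this quick poller could run alongside netpipe [a sketch, not a
packaged tool - it samples the receive error, drop & fifo/overrun columns
of /proc/net/dev once a second]:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        FILE *f = fopen("/proc/net/dev", "r");
        char line[512];

        if (!f) { perror("/proc/net/dev"); return 1; }
        fgets(line, sizeof line, f);   /* skip the two header lines */
        fgets(line, sizeof line, f);
        while (fgets(line, sizeof line, f)) {
            char name[32];
            unsigned long rxbytes, rxpkts, rxerr, rxdrop, rxfifo;

            /* fifo corresponds to the "overruns" count ifconfig shows */
            if (sscanf(line, " %31[^:]: %lu %lu %lu %lu %lu", name,
                       &rxbytes, &rxpkts, &rxerr, &rxdrop, &rxfifo) == 6 &&
                (!strcmp(name, "eth0") || !strcmp(name, "eth1")))
                printf("%s: rx errs %lu drop %lu overruns %lu\n",
                       name, rxerr, rxdrop, rxfifo);
        }
        fclose(f);
        sleep(1);
    }
}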

I'll also see if I can get a copy of the latest kernel, check whether the
problem recurs, & get back on that test [tomorrow?]. Thanks.

--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>


                                                                                                                    
                    "Andi Kleen"                                                                                    
                    <ak@suse.de>         To:     Mark H Johnson/RTS/Raytheon/US@RTS                                 
                                         cc:     linux-net@vger.kernel.org                                          
                    08/23/00             Subject:     Re: Performance with ethernet channel bonding                 
                    08:29 AM                                                                                        
                                                                                                                    
                                                                                                                    



On Tue, Aug 22, 2000 at 11:15:47AM -0500, Mark_H_Johnson@Raytheon.com
wrote:
> As part of a study for NASA, we ran "netpipe" on both single channel and
> two channel bonded Ethernet networks and got some odd results.
>  - The single channel results looked OK.
>  - The channel bonded results had some serious performance drops.
> I was not able to find a documented problem about this. Is this a known
> problem [with a work around?] or if new, what kind of information will be
> needed to help isolate it? I've included some sample data about our
> configuration and the symptoms below. I will gladly send raw data & more
> detailed configuration information if that will help. Please respond to me
> directly - I don't subscribe to linux-net & linux-kernel-digest was still
> down the last time I checked. Thanks.

The performance drops are probably caused by packet reordering. Extensive
packet reordering makes TCP detect congestion, causing extensive
retransmits and lower congestion windows. Linux 2.4.0test7-preLATEST has
some sender side improvements to handle reordering in the network better
(and some receiver side hacks that may or, more likely, may not work).
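To see why reordering reads as loss to TCP, here is a toy sketch of the
classic three-duplicate-ACK fast retransmit rule (not the kernel's code;
the arrival order is invented so that a segment is displaced by several
positions, as a queued burst on one slave could do):

#include <stdio.h>

int main(void)
{
    /* Invented arrival order: segments 1 and 6 arrive several slots late. */
    int arrivals[] = { 0, 2, 3, 4, 1, 5, 7, 8, 9, 6 };
    int got[16] = { 0 };
    int expected = 0, dupacks = 0, cwnd = 10;

    for (int i = 0; i < 10; i++) {
        int seq = arrivals[i];

        got[seq] = 1;
        if (seq == expected) {
            while (got[expected])
                expected++;         /* cumulative ACK advances */
            dupacks = 0;
        } else {
            dupacks++;              /* out of order: repeat the last ACK */
            printf("dup ACK %d waiting for segment %d (got %d)\n",
                   dupacks, expected, seq);
            if (dupacks == 3) {     /* looks like loss to the sender */
                cwnd /= 2;
                printf("3 dup ACKs -> spurious retransmit, cwnd now %d\n",
                       cwnd);
            }
        }
    }
    return 0;
}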

The best solution is to avoid reordering by not sending more than a single
stream per interface.


-Andi






