Hi Stephen,

>You are using ebtables, so that adds a lot of overhead processing the
>rules. The problem is that each packet means a CPU cache miss.
>What is the memory bus bandwidth of the Xeons?

I'll re-run that oprofile. The last set of tests I did was with ebtables
disabled, and it was still dropping packets. Ultimately, however, I need
ebtables (and tc) running.

Memory is DDR333.

Having installed irqbalance as you suggested, initial tests look
promising....

Leigh.

-----Original Message-----
From: Stephen Hemminger [mailto:shemminger@xxxxxxxxxxxxxxxxxxxx]
Sent: Wednesday, 14 November 2007 9:47 AM
To: Leigh Sharpe
Cc: bridge@xxxxxxxxxxxxxxxxxxxxxxxxxx
Subject: Re: Rx Buffer sizes on e1000

On Wed, 14 Nov 2007 09:24:18 +1100
"Leigh Sharpe" <lsharpe@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> >First, make sure you have enough bus bandwidth!
>
> Shouldn't a PCI bus be up to it? IIRC, PCI has a bus speed of 133MB/s.
> I'm only doing 100Mb/s of traffic, less than 1/8 of the bus speed. I
> don't have a PCI-X machine I can test this on at the moment.

I find a regular PCI bus (32 bit) tops out at about 600 Mbits/sec on most
machines. For PCI-X (64 bit/133 MHz) a realistic value is 6 Gbits/sec. The
problem is arbitration and transfer sizes. The absolute limits are:

	PCI32 33MHz  = 133MB/s
	PCI32 66MHz  = 266MB/s
	PCI64 33MHz  = 266MB/s
	PCI64 66MHz  = 533MB/s
	PCI-X 133MHz = 1066MB/s

That means for normal PCI32, one gigabit card or 6 100Mbit Ethernet
interfaces can saturate the bus. Also, all that I/O slows down the CPU
and memory interface.

> >Don't use kernel irq balancing, user space irqbalance daemon is smart
>
> I'll try that.
>
> >It would be useful to see what the kernel profiling (oprofile) shows.
>
> Abridged version as follows:
>
> CPU: P4 / Xeon, speed 2400.36 MHz (estimated)
> Counted GLOBAL_POWER_EVENTS events (time during which processor is not
> stopped) with a unit mask of 0x01 (mandatory) count 100000
> GLOBAL_POWER_E...|
>   samples|       %|
> ------------------
> 65889602  40.3276  e1000
> 54306736  33.2383  ebtables
> 26076156  15.9598  vmlinux
>  4490657   2.7485  bridge
>  2532733   1.5502  sch_cbq
>  2411378   1.4759  libnetsnmp.so.9.0.1
>  2120668   1.2979  ide_core
>  1391944   0.8519  oprofiled

You are using ebtables, so that adds a lot of overhead processing the
rules. The problem is that each packet means a CPU cache miss.
What is the memory bus bandwidth of the Xeons?

> --------------------------
> (There's more, naturally, but I doubt it's very useful.)
>
> >How are you measuring CPU utilization?
>
> As reported by 'top'.
>
> >Andrew Morton wrote a cyclesoaker to do this, if you want it, I'll dig
> >it up.
>
> Please.
>
> >And the dual-port e1000's add a layer of PCI bridge that also hurts
> >latency/bandwidth.
>
> I need bypass-cards in this particular application, so I don't have much
> choice in the matter.
>
> Thanks,
> Leigh

--
Stephen Hemminger <shemminger@xxxxxxxxxxxxxxxxxxxx>

_______________________________________________
Bridge mailing list
Bridge@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/bridge
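
For anyone curious about the cyclesoaker mentioned in the thread above: the
sketch below is not Andrew Morton's actual tool, just a minimal illustration
of the idea in C. It burns CPU in fixed-size chunks and reports how many
chunks complete per second; comparing the idle rate to the rate measured
while traffic is flowing shows how much CPU the interrupt and softirq paths
are really consuming, which per-process tools like top can under-report. The
1,000,000-iteration chunk size and the one-second reporting window are
arbitrary choices.

/* soaker.c -- a minimal sketch of the "cyclesoaker" idea, NOT Andrew
 * Morton's actual tool.  It burns CPU in fixed-size chunks and reports
 * how many chunks complete per second; under network load the rate
 * drops by roughly the fraction of CPU spent in interrupt and softirq
 * processing. */
#include <stdio.h>
#include <time.h>

static double elapsed(const struct timespec *a, const struct timespec *b)
{
	return (b->tv_sec - a->tv_sec) + (b->tv_nsec - a->tv_nsec) / 1e9;
}

int main(void)
{
	struct timespec start, now;
	volatile long i;

	for (;;) {
		unsigned long chunks = 0;
		double secs;

		clock_gettime(CLOCK_MONOTONIC, &start);
		do {
			/* one fixed chunk of busy work between time checks */
			for (i = 0; i < 1000000; i++)
				;
			chunks++;
			clock_gettime(CLOCK_MONOTONIC, &now);
			secs = elapsed(&start, &now);
		} while (secs < 1.0);

		printf("%.0f chunks/sec\n", chunks / secs);
		fflush(stdout);
	}
	return 0;
}

Build it with something like 'gcc -O2 -o soaker soaker.c' (older glibc may
also need -lrt for clock_gettime), pin one copy to each CPU with taskset,
note the idle rate, then repeat while pushing traffic through the bridge.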