> Given your numbers of 8000 cps and the above comments it would seem
> that we are well within any types of overload issues with any decent
> off the shelf server equipped with two dual core CPUs and the
> necessary memory. If I allocate 500 bytes per connection at the max
> connections I would need ~87Mb + machine overhead. That's not much in
> today's world of servers.

I would say so, unless someone on the list says NAT has completely different performance requirements from the connection-tracking-only machines. But I did do some tests to find the breaking points of such machines some time ago (see below), and there should be plenty of resources left for any additional NAT requirements, given your numbers.

As for memory, we are using 4GB RAM on our high-performance machines (access throughput/latency is important here) with a 2GB kernel / 2GB userspace split, in order to allow for huge firewall rulesets and to have Linux use larger default sizes for various network caches (without us having to fiddle with the settings).

Thomas

==== old test results ====

[..]

What we did is run a system

A)
CPU Intel® XEON(TM) E3110 3000MHz 6MB FSB1333 S775 2x
RAM DDR2 2GB PC667 Kingston ECC
NET INTEL Pro1000PT 1GBit 2xRJ45 NIC Dual Server
MB  SuperMicro X7SBi, Intel® 3210 + ICH9R chipset,
    Intel® 82573V + Intel® 82573L PCI-E Gigabit controllers

against a system

B)
CPU AMD Opteron 2220 2.8GHz DualCore Socket F
RAM 4x DDR2 1GB PC667 Kingston ECC-Reg CL5 with Parity Dual Rank
    + 2x DDR2 1GB / ECC / CL5 / 667MHz / with Parity / Dual Rank
NET INTEL Pro1000PT 1GBit 2xRJ45 NIC Dual Server
MB  Tyan Thunder h2000m (S3992G3NR-RS) DUAL SKT F EATX

I ran pktgen both generating a single 64-byte-packet UDP stream and generating 8192 parallel flows of flowlen 4 with randomized dst/src IPs and ports (also UDP, 64-byte packets), so that the number of conntrack entries stabilized at almost 512k (most of them timing out, of course).
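For reference, a pktgen configuration along the lines of the multi-flow test described above can be sketched roughly like this. This is only a sketch: the interface name, address ranges, and port ranges are placeholders, not the values actually used in the test.

```shell
# Sketch of a pktgen setup approximating the second test case:
# 8192 parallel flows, flowlen 4, randomized src/dst IPs and ports,
# 64-byte UDP packets. eth1 and the ranges below are placeholders.
PGDEV=/proc/net/pktgen/eth1

pgset() { echo "$1" > $PGDEV; }

pgset "pkt_size 64"          # 64-byte packets
pgset "count 0"              # run until stopped
pgset "flag IPSRC_RND"       # randomize source IPs
pgset "src_min 10.0.0.1"
pgset "src_max 10.0.255.254"
pgset "flag IPDST_RND"       # randomize destination IPs
pgset "dst_min 10.1.0.1"
pgset "dst_max 10.1.255.254"
pgset "flag UDPSRC_RND"      # randomize UDP ports
pgset "udp_src_min 1024"
pgset "udp_src_max 65535"
pgset "flag UDPDST_RND"
pgset "udp_dst_min 1024"
pgset "udp_dst_max 65535"
pgset "flows 8192"           # 8192 concurrent flows
pgset "flowlen 4"            # 4 packets per flow before re-randomizing

echo "start" > /proc/net/pktgen/pgctrl
```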
The result was that the Opteron system is essentially as fast as the Xeon system if you have just a single flow, but for the second, more realistic test case, the Xeon system was faster by about 10-20%, probably due to its much larger CPU cache.

RX/TX flow control was enabled, and iptables and connection tracking were loaded. The incoming and outgoing interfaces each had their smp_affinity set to a single CPU core. The kernel was 2.6.23.14, with the e1000 driver version that was current in Feb 2008.

As a ruleset, I had 2 chain trees for 8192 IPs each, for ingress and egress; each IP had 10 non-matching rules associated with it, but this ruleset was only searched for --state NEW of course... resulting in about 13*2=26 chain jumps and (13+10)*2=46 matches per NEW packet. (I had ~32k chains and ~210k rules.)

Unfortunately I only have the results for the Xeon system; the Opteron data got lost somehow ;-(

1 stream     / default buffers  eth0:eth1  735kpps
500k streams / default buffers  eth0:eth1  254kpps

But those numbers are obviously not comparable to yours... so...

[..]

============= snip ======
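The per-interface interrupt pinning mentioned above (one CPU core per NIC via smp_affinity) can be done roughly like this. The IRQ numbers below are placeholders; the real ones have to be looked up in /proc/interrupts on the machine in question.

```shell
# Sketch: pin each NIC's interrupt to its own core via smp_affinity.
# IRQ numbers 24 and 25 are placeholders; check /proc/interrupts.
grep eth /proc/interrupts

echo 1 > /proc/irq/24/smp_affinity   # eth0 -> core 0 (CPU bitmask 0x1)
echo 2 > /proc/irq/25/smp_affinity   # eth1 -> core 1 (CPU bitmask 0x2)
```

Note that irqbalance, if running, may overwrite these settings.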