On Wed, Jun 13, 2007 at 09:33:19AM -0300, Michel Santos wrote:
> 
> Dave Dykstra wrote in the last message:
> > Hi,
> > 
> > I wanted more throughput for my application than I was able to get
> > with one gigabit connection, so we have put in place a bonded
> > interface with two one-gigabit connections aggregated into one
> > two-gigabit connection.  Unfortunately, with one squid, re-using
> > objects that are small enough to fit into the Linux filesystem
> > cache but large enough to be efficient (a few megabytes each), it
> > maxes out a CPU core at around 140MB/s.  This is a dual dual-core
> > AMD Opteron 270 (2GHz) machine, so it is natural to want to take
> > advantage of another CPU.  (This is a 64-bit 2.6.9 Linux kernel,
> > and I think I have squeezed about all I am going to out of the
> > software.)  At first I tried running two squids separately on the
> > two different interfaces (without bonding, with 2 separate IP
> > addresses), but that confused the Cisco Service Load Balancer (SLB)
> > we're using to share the load & availability with another machine,
> > so I had to drop that idea.  For much the same reason, I don't want
> > to use two different ports.
> > 
> > So then the problem is how to distribute the load coming in on the
> > one IP address & port to two different squids.  Two different
> > processes can't open the same address & port on Linux, but one
> > process can open a socket and pass it to two forked children.  So,
> > I have modified squid 2.6STABLE13 to accept a command line option
> > with a file descriptor of an open socket to use instead of opening
> > its own socket.  I then wrote a small perl script to open the
> > socket and fork/exec the two squids.  This is working, and I am now
> > getting around 230MB/s throughput according to the squid SNMP
> > statistics.
> 
> I use another approach: I run three squids, using two on 127.0.0.2
> and .3 which serve as parents.  So the IP address which contacts the
> remote sites is the IP address of the server.  I get very high
> performance, and the setup is easy without helper programs.

So everything is filtered through one squid?  I would think that would
be a bottleneck.  Does it not actually cache anything itself, just
pass objects through with proxy-only?

> What was important to me is that I can make the two of them siblings
> so that objects do not get cached twice

So you have the two backend squids set as proxy-only too?  I would
think that half of your objects would then go through all 3 squids!

We have a pair of machines for availability & scaling purposes, and I
wanted them to be siblings so they wouldn't both have to contact the
origin server for the same object.  The problem with cache_peer
siblings is that once an item is in a cache but expired, squid will no
longer contact its siblings.  In my situation objects currently expire
frequently, so having siblings was pretty much useless (as I like to
describe it -- like the elevators in the 40-floor Wayside School, one
that goes up and one that goes down, which both worked perfectly --
once).  We had instead configured them as cache_peer parents to avoid
that problem, but when we turned on collapsed_forwarding in squid 2.6
that led us into some deadlock situations.  We decided that
collapsed_forwarding would result in fewer contacts to the origin
server than peering, so we ended up disabling peering altogether.

> I am curious if you can get higher throughput using sockets or tcp
> over loopbacks.

What kind of throughput do you get with your arrangement?

- Dave
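
P.S.  Since a sketch may be clearer than the description above, here
is roughly what the wrapper script looks like.  Treat it as
illustrative only: the --listen-fd option is an invented stand-in for
the option the patch actually adds, and the port, paths, and config
names are just examples.  Also note the two configs must not share a
cache_dir, access log, or pid_filename.

  #!/usr/bin/perl
  # Open one listening socket, then fork/exec two squids that inherit it.
  use strict;
  use warnings;
  use Fcntl;                 # F_GETFD, F_SETFD, FD_CLOEXEC
  use IO::Socket::INET;

  # Bind the single advertised proxy port before forking anything.
  my $listen = IO::Socket::INET->new(
      LocalPort => 3128,     # example port
      Listen    => 1024,
      ReuseAddr => 1,
  ) or die "cannot bind: $!";

  # Perl marks descriptors above $^F close-on-exec by default;
  # clear the flag so the socket survives exec().
  my $flags = fcntl($listen, F_GETFD, 0) or die "fcntl F_GETFD: $!";
  fcntl($listen, F_SETFD, $flags & ~FD_CLOEXEC)
      or die "fcntl F_SETFD: $!";

  my $fd = fileno($listen);
  for my $conf ('/etc/squid/squid-a.conf', '/etc/squid/squid-b.conf') {
      defined(my $pid = fork) or die "fork: $!";
      next if $pid;          # parent continues to the next child
      # Child: run squid in no-daemon mode (-N) on the inherited
      # socket; --listen-fd is the invented name for the patched option.
      exec '/usr/sbin/squid', '-N', '-f', $conf, '--listen-fd', $fd
          or die "exec squid: $!";
  }
  wait() for 1 .. 2;         # reap both squids when they exit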