Currently we are doing about 15 Mb/s upstream on a 100 Mb line.
Upgrading this to 1 Gbit if we need it won't be a problem.
There are two reasons that I want each squid server to perform optimally:
- failover: I want to be able to run on half of the squid servers
- the growth factor is large (the number of members went up tenfold, and
pageviews tripled, in the last 6 months)
I guess I overestimated the average file size. A second look shows me that
the images seen by far the most on the website (thumbnails) range
from 0.5 KB to 2 KB.
The next size up is 40 KB, which is accessed a lot less.
So I guess the average size of the files served, taking into account
their frequency, would be more towards 10 KB, maybe even less.
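
A rough sketch of that frequency-weighted average. The sizes come from the figures above (1.25 KB is the midpoint of the thumbnail range); the request shares are made-up assumptions, picked only to show what kind of mix gives an average near 10 KB. Real shares would have to come from the access logs.

```python
# Sizes are from the post; the request shares are hypothetical guesses.
request_mix = [
    (1.25, 0.78),  # thumbnails, midpoint of 0.5-2 KB, assumed 78% of requests
    (40.0, 0.22),  # next size up, assumed 22% of requests
]


def weighted_average_kb(mix):
    """Return the request-weighted mean object size in KB."""
    total_share = sum(share for _, share in mix)
    return sum(size * share for size, share in mix) / total_share


print(f"{weighted_average_kb(request_mix):.1f} KB")
```

With this assumed mix the 40 KB images dominate the average even though they are a minority of requests, so a smaller share of large images pulls the figure well below 10 KB.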
That's better, though with such a large growth rate you will need to
anticipate network bottlenecks far ahead, and be ready to switch to
gigabit on the squids or grow the number of squids, whichever one is
cheaper. Set up MRTG or something equivalent to keep an eye on this.
Squid has a built-in SNMP server that can produce some useful graphs
through MRTG.
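
To put a rough number on "far ahead", here is a back-of-the-envelope sketch. It assumes traffic scales with pageviews (tripling every 6 months, per the figures quoted) and that the growth stays steady, neither of which is guaranteed. The function name and parameters are just for illustration.

```python
import math


def months_until_saturation(current_mbps, capacity_mbps, growth_factor, period_months):
    """Months until traffic reaches capacity under steady exponential growth.

    growth_factor is the multiplier per period (3 means traffic triples),
    period_months is the length of that period.
    """
    if current_mbps >= capacity_mbps:
        return 0.0
    periods = math.log(capacity_mbps / current_mbps) / math.log(growth_factor)
    return periods * period_months


# 15 Mb/s today, tripling every 6 months (assumption taken from the pageview growth)
for capacity in (100, 1000):
    months = months_until_saturation(15, capacity, 3, 6)
    print(f"{capacity} Mb/s line: about {months:.0f} months")
```

Under those assumptions the 100 Mb line fills up in under a year, which is why graphing the actual trend matters more than the estimate itself.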
Another method would be CARP. I haven't used it myself, but it's used
to split the load between peers based on URL. Basically a hash-based
load balancing algorithm.
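
To illustrate the general idea of URL-hash load balancing, here is a minimal sketch. It is not the actual CARP hash function, just simple highest-score ("rendezvous") hashing, and the peer host names are hypothetical.

```python
import hashlib


def pick_peer(url, peers):
    """Pick the peer with the highest hash score for this URL.

    The same URL always maps to the same peer, so each object ends up
    cached on only one machine.
    """
    def score(peer):
        digest = hashlib.md5(f"{peer}|{url}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(peers, key=score)


peers = ["squid1.example.com", "squid2.example.com"]  # hypothetical host names
for url in ("/img/thumb/1.jpg", "/img/thumb/2.jpg", "/img/full/1.jpg"):
    print(url, "->", pick_peer(url, peers))
```

A useful property of this scheme is that when a peer drops out, only the URLs that were assigned to that peer move; everything else stays where it was.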
<cut from carp manual>
When the hosts receive an ARP request for 192.168.1.10, they both select
one of the virtual hosts based on the source IP address in the request.
The host which is master of that virtual host will reply to the request,
the other will ignore it.
</carp>
I think that load balancing is based on source IP instead of URL,
so CARP wouldn't be an option.
Is that the same CARP I was looking at?
http://squid-docs.sourceforge.net/latest/html/x2398.html
If you have a load balancer with packet inspection capabilities you
can also direct traffic that way. On F5 BigIPs the facility is called
iRules. I'm pretty sure NetScaler can do that too.
That is the kind of solution I am looking for, but without the cost. We
are a pretty new company without the money to buy expensive solutions, so
we prefer open source solutions.
Another point:
What is your experience with ext2/3 and reiserfs?
Our ext3 partitions tend to get corrupted when used for squid caches or
similar purposes.
I am inclined to change everything to reiserfs, but it's just a guess.
Does anyone have the same experience?
Read the flames on the LKML about ReiserFS and decide if it's stable for
production use ;-)
I've got six squids handling a similar traffic load to what you describe
(though on a smaller working set) on ext3 with no corruption issues.
No corruption issues on any other server using ext3 either. Looks like
you have a serious issue to fix there.
--
Robert Borkowski