Ronan Lucio wrote:
Hi Amos,
Thank you for the valuable answer.
I'm sorry for the delayed reply, but I needed to first read more about
CARP and the architecture to digest it better.
Now these things seem to be getting clearer in my mind.
Amos Jeffries wrote:
What kinkie means is that the efficiency is determined by the type of
NAS. Squid performs high-churn random-access IO when in full operation,
and needs to do so at the highest possible speed. The absolute _best_
solution is to have a high-speed storage spindle dedicated to just one
cache_dir in one Squid. None of the SAN/NAS systems I'm aware of can
make that kind of fine-grained assignment.
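As a rough squid.conf sketch of that dedicated-spindle layout, one
cache_dir per physical disk (paths and sizes here are placeholders,
not recommendations):

  # one cache_dir per dedicated disk spindle
  cache_dir aufs /disk1/squid 40000 16 256
  cache_dir aufs /disk2/squid 40000 16 256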
Actually, only the application servers and the third server (lighttpd
for static content... if so) would have access to the storage.
The Squid servers would read and write cached files locally on a SAS HD.
Almost...
1a) Squid load balancers using CARP to select the layer-2 cache.
1b) Squid caching servers.
then whatever backend you like...
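As a minimal sketch, the (1a) frontend squid.conf could look like the
following, assuming two layer-2 caches at cache1/cache2.example.com
(hostnames and site are placeholders):

  # layer-1 frontend: no local cache, CARP-hash requests across layer-2
  http_port 80 accel defaultsite=www.example.com
  cache_peer cache1.example.com parent 3128 0 carp
  cache_peer cache2.example.com parent 3128 0 carp
  cache deny all
  never_direct allow all

Each URL then hashes consistently to one layer-2 cache, so no object
is stored twice across the caching layer.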
It really seems to be a better choice.
Do you have any idea how many page hits one Squid server would handle?
I'm thinking about a dual quad-core with 4 GB RAM serving only small
files (less than 300 KB each).
Adrian has measured Squid 2.7 at up to 800-850 req/sec. Squid-3 is
untested but expected to manage at minimum 500 req/sec (2.5 was capable
of this around the time the Squid-3 branch was made).
NP: that is on single-threaded machines.
We have confirmed data from one Squid-2.7 install doing 2.7K req/sec on
high-end quad-core hardware a few years back and unconfirmed reports of
another deployment (unknown version) topping 5K req/sec 1-2 years ago.
Wikimedia are the poster deployment for CARP, at around 200 Squid last
I heard, and push dozens of TB per day (their network diagrams show 3
load-balancing Squid; I'm not sure if that's right though). So even if
one Squid can't handle it, the model scales enormously.
Their case study has graphs from back when it was 50 Squid showing 560+
req/sec ...
http://www.squid-cache.org/Library/wikimedia.dyn
Chris Robertson had some numbers a while back, he may chime in with
something more accurate :)
Given the file sizes you mention, I'd suggest using Squid 2.7 with COSS
for the caching layer as well.
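A hedged example of such a COSS cache_dir, sized for small objects
(the stripe size and option values are placeholders to adapt):

  # COSS stripe for many small objects; max-size is in bytes
  cache_dir coss /var/spool/squid/coss 4096 max-size=307200 block-size=512
  maximum_object_size 300 KB

COSS packs small objects into one pre-allocated stripe file, avoiding
the per-object filesystem overhead that hurts UFS/AUFS at this workload.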
Amos
2) Application servers as the backend.
3) A third server serving static resources.
It's up to you whether the system is large enough to require separate
(2) and (3). For a small enough system (for some value of small) it's
sufficient to combine them and provide caching headers to push most of
the static work into the (1b) Squid.
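For example, the combined setup could mean the (1b) caches point at a
single origin and hold the static objects via refresh_pattern; the
hostname and times below are assumptions:

  # layer-2 cache talking to a combined app+static origin
  cache_peer app1.example.com parent 80 0 no-query originserver name=app
  # keep static objects at least a day, at most a week
  refresh_pattern -i \.(gif|jpg|png|css|js)$ 1440 90% 10080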
Hmmm, the doubt remains.
I'm thinking of that third server for two reasons:
- To move those Apache processes (~30 hits) off the application
servers, since the goal is to optimize their resources.
- To get better performance by assigning them (images, CSS, and JS
files) a different sub-domain, which would allow client browsers to
make 4-6 parallel requests.
Thank you,
Ronan
It's up to you. I just point out that the whole caching model is to
convert generated content into static content and serve that from
Squid caches as close to the client as possible. The load on the
static server will, or should, be extremely low by comparison to the
dynamic servers.
The load reduction on static content servers with longish caching
times starts high and approaches 100%, which makes a dedicated static
content server possibly a waste of a good CPU :)
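Concretely, 'converting generated content into static' mostly comes
down to the origin emitting response headers along these lines
(values are only an example):

  HTTP/1.1 200 OK
  Content-Type: image/png
  Cache-Control: public, max-age=86400

With that, the (1b) Squid layer answers repeat requests itself and the
origin is only touched once per day per object.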
These are dynamic image servers...
http://www.squid-cache.org/Library/flickr.dyn
The 'Layer-7 URL hashing' they mention is CARP.
Amos
--
Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE16
Current Beta Squid 3.1.0.8