H.Päiväniemi wrote:
Hi all,
We are currently using Oracle Web Cache on quite a big site. We have 18 cache servers and caching is done by the
reverse-proxy method. There are approx. 50,000 HTTP req/s across our cache servers in total. The system is running Linux.
Because Oracle licenses are quite expensive, we would like to know if Squid could replace OWC on our site.
I believe Squid should be able to step into the job of any HTTP proxy
out there, provided you are willing to accept some drop in raw speed
compared with what a specialized proxy can offer.
50,000 req/sec is beyond a single Squid AFAIK (I'd love to be proven wrong
:). It may require 20 or so Squid instances tuned for high speed. How
many caches is that load spread over currently?
Looking at the docs on OWC it appears that you may be in the unfortunate
position of needing both high-speed performance and ESI capability.
Squid-2.7 can provide the speed. Squid-3.x can provide ESI (without the
proprietary Oracle extensions) but not yet as fast as Squid-2.7 (3.0 is
a major step down in speed; 3.1 is only a minor one, but still down).
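For reference, the reverse-proxy (accelerator) side of squid.conf is only a handful of lines. A minimal sketch, with made-up hostnames:

  http_port 80 accel defaultsite=www.example.com vhost
  cache_peer origin1.example.com parent 80 0 no-query originserver name=origin1
  acl our_sites dstdomain www.example.com
  http_access allow our_sites
  cache_peer_access origin1 allow our_sites
  cache_peer_access origin1 deny all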
I would be _very_ pleased if any of you could give me comments and thoughts about this. My main concern is whether Squid
is or is not able to work in this kind of situation where we are using:
- multiple ip-based virtual hosts (about 50)
Yes.
Domain-based virtual hosts are far easier to manage, though.
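Each IP-based virtual host gets its own http_port line, and the myip ACL type matches on the address the client connected to. A sketch with placeholder addresses:

  # one listening socket per IP-based virtual host
  http_port 192.0.2.10:80 accel defaultsite=site-a.example.com
  http_port 192.0.2.11:80 accel defaultsite=site-b.example.com

  # match requests by the local IP the client connected to
  acl to_site_a myip 192.0.2.10
  acl to_site_b myip 192.0.2.11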
- multiple origin server pools of course
Yes.
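A pool is simply several cache_peer lines sharing a load-balancing option and the same cache_peer_access rules. Hostnames and ACL names here are made up:

  cache_peer app1.example.com parent 80 0 no-query originserver round-robin name=pool1-app1
  cache_peer app2.example.com parent 80 0 no-query originserver round-robin name=pool1-app2
  acl site_a dstdomain www.site-a.example.com
  cache_peer_access pool1-app1 allow site_a
  cache_peer_access pool1-app2 allow site_a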
- origin server polling to see if they are ok
Yes. Via several methods: ICMP, passive data from replies, and active
URL monitoring.
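Active URL monitoring is done with the Squid-2 cache_peer options monitorurl=, monitorinterval=, monitorsize= and monitortimeout= (AFAIK not yet ported to 3.x; check the 2.7 release notes). The URL and interval below are only an example:

  cache_peer app1.example.com parent 80 0 no-query originserver name=pool1-app1 monitorurl=http://app1.example.com/health monitorinterval=30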
- both common and site-specific cache-rules
Yes. Squid has a configuration file that controls its behaviour.
Or did you mean something different?
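In practice the common rules are written once and ACLs narrow any directive to a specific site. For example (domains and ACL names are made up):

  # site-specific: never cache this site's private area
  acl site_b_private dstdomain private.site-b.example.com
  cache deny site_b_private

  # common: everything else is cacheable, subject to the refresh rules
  cache allow all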
- different kind of expiration policies
Depends on what you are expiring. HTTP reply objects can be expired
under origin server control, or with heuristic algorithms selected on
URL patterns.
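The heuristic side is controlled by refresh_pattern lines (minimum minutes, percent of object age, maximum minutes). The values here are purely illustrative:

  # static objects: fresh for at least 1 day, at most 1 week
  refresh_pattern -i \.(gif|jpg|png|css|js)$ 1440 20% 10080
  # catch-all for everything without explicit expiry information
  refresh_pattern . 0 20% 4320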
- cookie-based and header-based personalization to cache multiple versions of some sites
According to the HTTP standards, yes. Cookie headers are usable, but they
get polluted easily and so are considered a poor choice of variant
selector (an origin-created ETag is better).
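The standard mechanism is Vary (plus ETag): the origin lists the request headers that select a variant, and Squid stores one copy per header combination. An illustrative reply, with made-up values:

  HTTP/1.1 200 OK
  Cache-Control: public, max-age=300
  Vary: Accept-Language
  ETag: "page-v3-fi"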
- origin server capacity throttling / limiting etc
Indirectly.
Squid can throttle client output speeds based on origin (delay pools),
or set QoS markings so that kernel networking can throttle the
Squid->origin connections.
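Roughly: a class-1 delay pool caps the aggregate bandwidth for whatever an ACL matches, and tcp_outgoing_tos marks the server-side traffic for external QoS. The numbers and ACL name below are only examples:

  # cap replies matching site_a to roughly 1 MB/s aggregate
  delay_pools 1
  delay_class 1 1
  delay_parameters 1 1000000/1000000
  delay_access 1 allow site_a
  delay_access 1 deny all

  # mark Squid->origin packets so the network layer can throttle them
  tcp_outgoing_tos 0x20 site_a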
- caching both GET, Get with query strings and POST
GET: yes, natively.
GET with query strings: yes, possibly after a minor config change (example below).
POST: no. It is not possible for Squid to know about and perform the POST
handling actions itself. The POST will always be sent back to the origin
server, and the origin server will always reply with whatever result the
POST generates.
NP: pages sitting at the other end of 3xx replies to a POST can be cached,
however, since they are really fetched by a follow-up GET.
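On the query-string point: the "minor config change" is just removing the old default lines that refused to cache dynamic URLs. Recent defaults already use the refresh_pattern form instead:

  # old defaults to delete if your config still carries them:
  #   acl QUERY urlpath_regex cgi-bin \?
  #   cache deny QUERY
  # current default; dynamic replies that carry valid Cache-Control or
  # Expires headers from the origin are then cached normally:
  refresh_pattern -i (/cgi-bin/|\?) 0 0% 0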
- have to have some kind of method to order caches to expire (clear) specific content immediately when asked
Yes. Both HTTP PURGE and HTCP CLR requests are supported.
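Both need to be explicitly permitted from your management hosts. The source subnet below is a placeholder:

  acl purge_hosts src 192.0.2.0/24
  acl PURGE method PURGE
  http_access allow PURGE purge_hosts
  http_access deny PURGE

  htcp_port 4827
  htcp_access allow purge_hosts
  htcp_access deny all
  htcp_clr_access allow purge_hosts
  htcp_clr_access deny all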
So how is it - can Squid be used or are we gonna have major problems or limitations?
Yes, I have seen a few people talk about it. They had some minor config
issues. We tend not to hear from anyone who set it up without trouble.
Amos
--
Please be using
Current Stable Squid 2.7.STABLE7 or 3.0.STABLE24
Current Beta Squid 3.1.0.17