Squid vs httpd mod_cache
- Subject: Squid vs httpd mod_cache
- From: Neil Gunton <neil@xxxxxxxxxxxx>
- Date: Mon, 24 Nov 2008 11:25:35 -0800
- User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.17) Gecko/20080829 Iceape/1.1.12 (Debian-1.1.12-1)
Hi all,
I'm running a LAMP community website (Debian Lenny, Apache 2.2.9, MySQL,
mod_perl) which gets around 100,000 page requests per day. I currently
use two builds of Apache - one lightweight front-end caching reverse
proxy, and a heavy back-end mod_perl. This worked well for years while I
was using Apache 1.3, since I was using Igor Sysoev's mod_accel and
mod_deflate modules to do the reverse proxy and caching. Now that I have
upgraded to Apache 2.2, I can't use his modules any more, so I've been
trying to use the stock mod_cache. The server is a dual Opteron 265
(i.e. 4 cores), 4GB RAM, 4x10k SCSI drives in RAID0 (I know it's risky,
but I need the space and performance, and backup is instantaneous with
MySQL replication).
Everything's working fine, mostly, but I'm having some issues with the
cache management. In a nutshell, htcacheclean just doesn't seem to be
able to keep up with managing the cache pruning (i.e. keeping it down to
a reasonable size). If I run htcacheclean in cron mode, then it takes
hours to complete its run, and while running it hogs the disks and
produces big iowait times. If I run it in daemon mode with the -n "nice"
option, it produces about half the iowait, but then it can't keep up
with the cache growth.
I'm concerned about the cache structure - it's a 3-level directory, and
it seems to take a long time just to traverse it. Even doing a simple du
on it seems to take forever, currently about 3 hours or more, and that's
for about 10GB of cache. I'd prefer to keep the cache down to more like
1GB at the most. In fact, that's what I have htcacheclean set to -
1000MB. But it doesn't seem to be doing the job.
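To make the pruning problem concrete, here is a minimal Python sketch of what any size-based cleaner has to do: walk the whole tree, stat every file, and pick the oldest ones until the total fits the limit. This is an illustration only, not how htcacheclean is actually implemented; the function name plan_prune and the oldest-first (mtime) policy are assumptions made for the example. The point is that the full directory walk is the expensive part, which matches how slow du is on the same tree.

```python
import os


def plan_prune(cache_root, limit_bytes):
    """Return (total_bytes, paths_to_delete) for a size-limited cache.

    Hypothetical helper: picks the oldest files first (by mtime) until
    the remaining total is at or below limit_bytes. Nothing is deleted.
    """
    entries = []
    total = 0
    # The walk touches every directory and stats every file; on a large
    # multi-level cache this is where most of the disk I/O goes.
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared while we were walking
            entries.append((st.st_mtime, st.st_size, path))
            total += st.st_size

    entries.sort()  # oldest modification time first

    to_delete = []
    remaining = total
    for _mtime, size, path in entries:
        if remaining <= limit_bytes:
            break
        to_delete.append(path)
        remaining -= size
    return total, to_delete
```

Calling plan_prune("/path/to/cache", 1000 * 1024 * 1024) would report the current size and the files that would have to go to get back under 1000MB; the path here is a placeholder.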
I've been asking around the Apache and mod_perl lists about ways to
improve this. Someone suggested using Squid instead. So here I am - I've
never used Squid, mostly because I always used Apache and really need
the mod_rewrite capabilities for doing things like blocking image
hotlinking from other sites. I really need a front-end reverse proxy
that has the capability to do access control stuff like this, as well as
redirects for old content etc - you know, all the things you can do with
mod_rewrite. I really don't want to have to pass all that back to the
mod_perl processes.
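For anyone unfamiliar with the hotlinking case, the decision I need the front end to make is roughly the following, sketched in Python rather than as real mod_rewrite rules. The host names, the extension list and the choice to let empty Referers through are all assumptions for the sake of the example, not my actual configuration.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration only.
ALLOWED_HOSTS = {"example.com", "www.example.com"}
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def is_hotlink(request_path, referer):
    """Return True if an image request appears to come from another site."""
    if not request_path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # only image requests are checked
    if not referer:
        return False  # assumption: allow requests with no Referer header
    host = urlparse(referer).hostname  # None if the Referer is malformed
    return host not in ALLOWED_HOSTS
```

A request for /pics/photo.jpg with a Referer on some other site would be refused or redirected, while the same request from one of the allowed hosts would be served from cache as usual.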
I would like to know how good Squid's cache management (i.e. pruning)
is. I get the impression that mod_cache in Apache 2.2 is not very mature
- some of the cache management features don't even seem to be
implemented yet. I assume that Squid is a much more mature product, and
thus I'd hope that it has cache management pretty much down pat.
How does Squid manage its disk cache? Does it consume a lot of disk I/O
when doing it?
Has anybody else here migrated from using Apache's mod_cache to Squid,
and if so do you have any insights?
Lastly, if I do decide to use Squid, is the O'Reilly book from 2004
still relevant, or is it out of date now? I know there's a lot of stuff
online, but I like to have a handy book reference, plus a well-written
book often has a good intro to the tool. This book seems to get only
5-star reviews on Amazon. Is it still up to date?
Thanks in advance,
Neil