
Squid vs httpd mod_cache


 



Hi all,

I'm running a LAMP community website (Debian Lenny, Apache 2.2.9, MySQL, mod_perl) which gets around 100,000 page requests per day. I currently use two builds of Apache: a lightweight front-end caching reverse proxy and a heavy mod_perl back end. This worked well for years while I was using Apache 1.3, since I was using Igor Sysoev's mod_accel and mod_deflate modules to do the reverse proxying and caching. Now that I have upgraded to Apache 2.2, I can't use his modules any more, so I've been trying to use the stock mod_cache. The server is a dual Opteron 265 (i.e. 4 cores), 4GB RAM, 4x10k SCSI drives in RAID0 (I know it's risky, but I need the space and performance, and backup is instantaneous with MySQL replication).

Everything's working fine, mostly, but I'm having some issues with the cache management. In a nutshell, htcacheclean just doesn't seem to be able to keep up with pruning the cache (i.e. keeping it down to a reasonable size). If I run htcacheclean from cron, it takes hours to complete its run, and while running it hogs the disks and produces big iowait times. If I run it in daemon mode with the -n "nice" option, it produces about half the iowait, but then it doesn't keep up with the cache growth.

I'm concerned about the cache structure - it's a 3-level directory, and it seems to take a long time just to traverse it. Even doing a simple du on it seems to take forever, currently about 3 hours or more, and that's for about 10GB of cache. I'd prefer to keep the cache down to more like 1GB at the most. In fact, that's what I have htcacheclean set to - 1000MB. But it doesn't seem to be doing the job.
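To make clear what I mean by traversing the cache, here is a rough Python sketch of the kind of walk involved. It is only an illustration: the function name is made up, the path in the usage note is an assumed location rather than my actual CacheRoot, and it sums apparent file sizes rather than the disk blocks that du reports.

```python
import os

def cache_usage(root):
    """Return (file_count, total_bytes) for everything under root."""
    file_count = 0
    total_bytes = 0
    stack = [root]  # explicit stack, so deep trees don't hit recursion limits
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        file_count += 1
                        total_bytes += entry.stat(follow_symlinks=False).st_size
        except OSError:
            # On a live cache, files and directories can vanish mid-walk;
            # skip whatever is left of this directory and carry on.
            continue
    return file_count, total_bytes
```

Calling something like cache_usage("/var/cache/apache2/mod_disk_cache") (path assumed) has to stat every file in every leaf directory, which is presumably why any tool that walks the whole tree is so slow here.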

I've been asking around the Apache and mod_perl lists about ways to improve this. Someone suggested using Squid instead. So here I am. I've never used Squid, mostly because I have always used Apache and really need the mod_rewrite capabilities for things like blocking image hotlinking from other sites. I need a front-end reverse proxy that can do this kind of access control, as well as redirects for old content and so on: all the things you can do with mod_rewrite. I really don't want to have to pass all that back to the mod_perl processes.
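To make the hotlinking requirement concrete, here is a rough sketch in Python of the check the front end has to make on image requests. The host names and the function name are invented for illustration. It is not my actual mod_rewrite configuration, just the logic such rules usually express: allow an empty Referer, allow my own hosts, refuse everything else.

```python
from urllib.parse import urlsplit

# Hypothetical host names standing in for the site's own domains.
ALLOWED_HOSTS = {"example.org", "www.example.org"}

def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True if an image request should be refused as a hotlink."""
    if not referer:
        # Many browsers and proxies send no Referer at all; let those through.
        return False
    host = urlsplit(referer).hostname  # lower-cased, or None if unparsable
    if host is None:
        # Non-empty but malformed Referer: treat it as foreign.
        return True
    return host not in allowed_hosts
```

Whatever replaces Apache in front would need some way of expressing a rule like this without handing the request to the mod_perl back end.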

I would like to know how good Squid's cache management (i.e. pruning) is. I get the impression that mod_cache in Apache 2.2 is not very mature - some of the cache management features don't even seem to be implemented yet. I assume that Squid is a much more mature product, and thus I'd hope that it has cache management pretty much down pat.

How does Squid manage its disk cache? Does it consume a lot of disk I/O when doing it?

Has anybody else here migrated from using Apache's mod_cache to Squid, and if so do you have any insights?

Lastly, if I do decide to use Squid, is the O'Reilly book from 2004 still relevant, or is it out of date now? I know there's a lot of stuff online, but I like to have a handy book reference, plus a well-written book often has a good intro to the tool. This book seems to get only 5-star reviews on Amazon. Is it still up to date?

Thanks in advance,

Neil
