A couple of lessons learned on my end, both from my own experience and
picked up from various squid-users threads:
I've said this before, but never underestimate the value of the kernel
page cache. If you need to scale the box, put in as much RAM as you
can afford.
Also, as has been said before, squid + RAID = PAIN (particularly
RAID5). Performance will be much better if you can set up multiple
physical disks under separate cache_dirs, thus allowing async reads to
take place in parallel. If disk redundancy is a must, stick with RAID
1 pairs (multiple RAID 1 pairs work well, particularly with a hardware
controller).
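As a rough illustration of the one-cache_dir-per-disk layout, a squid.conf
fragment might look like the following. The mount points and sizes are made
up for the example; each path is assumed to sit on its own physical disk (or
its own RAID 1 pair).

```
# Hypothetical layout: one aufs cache_dir per physical disk.
# Format: cache_dir aufs <directory> <size in MB> <L1 dirs> <L2 dirs>
cache_dir aufs /cache1 20000 16 256
cache_dir aufs /cache2 20000 16 256
cache_dir aufs /cache3 20000 16 256
```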
If your traffic load is mostly small (under roughly 1 MB) objects, consider
using COSS storage as an alternative to AUFS; this will give you
much more bang for the buck if you're serving large numbers of small
objects, since it eliminates the overhead of the millions of
open()/close() kernel system calls you'd see with AUFS.
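A minimal sketch of what that could look like in squid.conf, assuming a
Squid 2.x build with COSS support compiled in (e.g. via --enable-storeio).
The paths, sizes, and max-size cutoff are illustrative only, not
recommendations.

```
# Hypothetical: small objects go to a COSS store, larger ones to AUFS.
# max-size is in bytes; objects above it are not stored in that cache_dir.
cache_dir coss /cache1/coss 8000 max-size=131072
cache_dir aufs /cache2 20000 16 256
```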
If you find yourself hitting the single-core CPU bottleneck due to
squid's main loop, it is possible to run multiple squids on a box,
although each one requires its own cache storage. If you need to move
to this, consider configuring one or more "front-end" squids that
refer queries to multiple "back-end" parent caches via CARP to
avoid duplicating object storage.
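For the front-end/back-end layout, the front-end's squid.conf could be
sketched roughly as below. The addresses, ports, and peer names are
placeholders, and the back-ends are assumed to be separate squid instances
on the same box, each started with its own config file (squid -f) and its
own http_port, pid_filename, and cache_dir.

```
# Hypothetical front-end: CARP spreads requests across the back-end parents,
# so each object should land on only one back-end.
cache_peer 127.0.0.1 parent 4001 0 carp no-query proxy-only name=backend1
cache_peer 127.0.0.1 parent 4002 0 carp no-query proxy-only name=backend2
# Always go through a parent rather than fetching directly.
never_direct allow all
```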
HTH,
-Chris
On May 19, 2009, at 8:47 AM, rihad wrote:
Jeff Pang wrote:
rihad:
But what about POSIX threads & async I/O (./configure
--enable-async-io=2 ...)? Don't they take advantage of multiple
CPUs/cores/cache_dirs?
Yes, async I/O benefits from multiple CPUs for disk I/O, if you're using it.
Squid's main daemon is a single process, which gains nothing from an
SMP system.
Since disk I/O is often the bottleneck (given enough RAM), can it be
said that, thanks to async I/O, Squid mostly scales well with the
number of CPUs, issuing several disk I/O operations simultaneously and
asynchronously so that it can proceed with the main loop without
waiting for I/O completion? In that case that part of the FAQ needs
updating, I guess.