Mike Diggins wrote:
> OS: RHEL 5.3 64 Bit
> Squid: squid-2.7.STABLE9
>
> Decided to do a very light load test on a new Squid server (specs
> above). I'm running the AUFS storage type:
>
>   cache_dir aufs /usr/local/squid/var/cache 20000 16 256
>
> The test ran for about an hour with about 10 clients loading random web
> pages. The CPU / wait-IO never got above 2%, so I was surprised to see a
> few of these warnings in cache.log:

Squid spends a huge amount of its time processing the HTTP headers to
figure out what needs doing. Disk IO is only a small and often short
portion of the time for most requests.

>   squidaio_queue_request: WARNING - Queue congestion
>
> Saw about 5 or 6 of these over the hour but they seemed to slow down (or
> stop) as time went on. I did some reading and found this:
>
>   Queue congestion starts appearing if there are 8 operations queued
>   and not yet handled by the IO threads. The limit is then doubled
>   (to 16) before the next warning appears. And so on...
>
> Does this mean the warning limits are doubled each time? If so, that
> would seem to just mask the problem, no? I'm not sure if I have a
> problem or not.
Summary: 5-6 seems okay to me. Slowing and stopping is even better.
Details:
Yes it keeps doubling. The queue being mentioned is the buffer between
NIC bandwidth and disk bandwidth.
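A hypothetical sketch of that doubling behaviour (this is an illustration of the description above, not Squid's actual source): the congestion threshold starts at 8 queued operations, and each time it is exceeded the threshold is doubled before the warning is printed, so the warnings naturally become rarer as the limit adapts.

```python
# Illustrative model of the queue-congestion warning described above.
# Names (AioQueue, warn_limit) are hypothetical, not Squid internals.
class AioQueue:
    def __init__(self):
        self.queue = []
        self.warn_limit = 8  # initial congestion threshold (8 queued ops)

    def enqueue(self, request):
        self.queue.append(request)
        if len(self.queue) > self.warn_limit:
            # double the threshold before the next warning can appear
            self.warn_limit *= 2
            print("squidaio_queue_request: WARNING - Queue congestion")
```

With this model, the 9th queued operation triggers the first warning and raises the threshold to 16; the 17th would trigger the next, and so on.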
It is self-regulating. There is a client waiting for each operation to
complete; if operations are delayed in the queue, the client sees longer
loading times and is slower to fetch the next object. On top of that,
the warning is shown at the moment Squid increases its RAM buffer to
cope with any extra load going into the future.
With slow disks, or a fast NIC pushing more than a few MB of client
traffic, the queue can be expected to overflow and be expanded
automatically a few times. The same is seen shortly after startup during
a DIRTY cache rebuild, or during the ramp-up to peak network traffic
while Squid learns what size of queue buffer is needed to cope.
It is only a problem if it continues and does not go away after these
events when the disk(s) should start catching up again. That type of
behaviour means the disk is too slow for the *average* client traffic
load being pushed through it. Multiple disks or proxies can help if that
is the case.
I don't have it easily at hand, but there is a rule of thumb for disk
bandwidth you can calculate from the RPM (4KB per RPM, or something).
That gives a very rough estimate of how much traffic can flow to/from
the disk per second before this warning shows up the first time; even a
small sustained increase above that causes the queue to grow a few times.
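As a back-of-envelope illustration of that rule of thumb, reading "4KB per RPM" as roughly 4KB per platter revolution (an assumption; the exact constant is uncertain, as noted above):

```python
# Rough disk-bandwidth estimate from spindle speed, assuming ~4KB of
# random IO per revolution (a guess, per the rule of thumb above).
def rough_disk_bandwidth_kb_per_sec(rpm, kb_per_rev=4):
    revs_per_sec = rpm / 60        # revolutions per second
    return revs_per_sec * kb_per_rev

# A 7200 RPM disk: 120 rev/s * 4 KB = 480 KB/s of random IO.
```

If sustained client traffic to a cache_dir exceeds that ballpark figure, the AIO queue can be expected to grow and the warning to appear.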
Adding large numbers of extra AIO threads can appear to help, but only
because all those threads collectively provide more buffer space in RAM.
In reality it's better to fine-tune the thread count and spread the load
across multiple disks.
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.7
Beta testers wanted for 3.2.0.1