Mike Diggins wrote:
> OS: RHEL 5.3 64 Bit
> Squid: squid-2.7.STABLE9
>
> Decided to do a very light load test on a new Squid server (specs
> above). I'm running the AUFS storage type:
>
>   cache_dir aufs /usr/local/squid/var/cache 20000 16 256
>
> The test ran for about an hour with about 10 clients loading random web
> pages. The CPU / wait-IO never got above 2%, so I was surprised to see a
> few of these warnings in cache.log:

Squid spends a huge amount of its time processing the HTTP headers to
figure out what needs doing. Disk IO is only a small and often short
portion of the time for most requests.

>   squidaio_queue_request: WARNING - Queue congestion
>
> Saw about 5 or 6 of these over the hour but they seemed to slow down (or
> stop) as time went on. I did some reading and found this:
>
>   Queue congestion starts appearing if there are 8 operations queued
>   and not yet handled by the IO threads. The limit is then doubled
>   (to 16) before the next warning appears. And so on...
>
> Does this mean the warning limits are doubled each time? If so, that
> would seem to just mask the problem, no? I'm not sure if I have a
> problem or not.
Summary: 5-6 seems okay to me. Slowing and stopping is even better.
Details:
Yes it keeps doubling. The queue being mentioned is the buffer between
NIC bandwidth and disk bandwidth.
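A hypothetical sketch of that doubling behaviour (this is an illustration of the description above, not Squid's actual source): the congestion threshold starts at 8 queued operations, and each time it is exceeded the threshold is doubled before the warning is printed, so the warnings naturally become rarer as the limit adapts.

```python
# Illustrative model of the queue-congestion warning described above.
# Names (AioQueue, warn_limit) are hypothetical, not Squid internals.
class AioQueue:
    def __init__(self):
        self.queue = []
        self.warn_limit = 8  # initial congestion threshold (8 queued ops)

    def enqueue(self, request):
        self.queue.append(request)
        if len(self.queue) > self.warn_limit:
            # double the threshold before the next warning can appear
            self.warn_limit *= 2
            print("squidaio_queue_request: WARNING - Queue congestion")
```

With this model, the 9th queued operation triggers the first warning and raises the threshold to 16; the 17th would trigger the next, and so on.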
It is self-regulating. There is a client waiting for each operation to
complete; if operations are delayed in the queue, the client sees longer
loading times and is slower to fetch the next object. On top of that,
the warning is shown at the moment Squid increases its RAM buffer to
cope with any extra load going into the future.
With slow disks, or a fast NIC pushing more than a few MB of client
traffic, the queue can be expected to overflow and be expanded
automatically a few times. The same is seen shortly after startup during
a DIRTY cache rebuild, or during the ramp-up to peak network traffic
while Squid learns what size of queue buffer is needed to cope.
It is only a problem if it continues and does not go away after these
events when the disk(s) should start catching up again. That type of
behaviour means the disk is too slow for the *average* client traffic
load being pushed through it. Multiple disks or proxies can help if that
is the case.
I don't have it easily at hand, but there is a rule of thumb for disk
bandwidth you can calculate from the RPM (4KB per RPM, or something).
That gives a very rough estimate of how much traffic can flow to/from
the disk per second before this warning shows up the first time; even a
small sustained increase above that causes the queue to grow a few times.
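As a back-of-envelope illustration of that rule of thumb, reading "4KB per RPM" as roughly 4KB per platter revolution (an assumption; the exact constant is uncertain, as noted above):

```python
# Rough disk-bandwidth estimate from spindle speed, assuming ~4KB of
# random IO per revolution (a guess, per the rule of thumb above).
def rough_disk_bandwidth_kb_per_sec(rpm, kb_per_rev=4):
    revs_per_sec = rpm / 60        # revolutions per second
    return revs_per_sec * kb_per_rev

# A 7200 RPM disk: 120 rev/s * 4 KB = 480 KB/s of random IO.
```

If sustained client traffic to a cache_dir exceeds that ballpark figure, the AIO queue can be expected to grow and the warning to appear.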
Adding large numbers of extra AIO threads can appear to help, but only
because all those threads collectively provide more buffer space in RAM.
In reality it's better to fine-tune the thread count and spread the load
across multiple disks.
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.7
Beta testers wanted for 3.2.0.1