
Re: Strange performance effects on squid during off peak hours

On 16/09/10 01:01, Martin Sperl wrote:
Hi everyone,

we are seeing a strange response-time effect over 24 hours when delivering content via a Squid+ICAP service (3.0.STABLE9 - I know it is old, but getting anything changed in a production environment can be VERY hard...). The ICAP server we use rewrites some URLs and also rewrites parts of the response content.

Essentially we see that the average response time is better during peak hours than during off-peak hours.
Here is a report for one day, covering all CSS files delivered with cache status TCP_MEM_HIT (taken from Squid's extended access logs) on a single server (all servers show a similar effect):

Here is the quick overview (ART = average response time in seconds):
+------+------+-------+
| hour | hits | ART   |
+------+------+-------+
|    0 | 4232 | 0.016 |
|    1 | 4553 | 0.015 |
|    2 | 4238 | 0.015 |
|    3 | 4026 | 0.018 |
|    4 | 1270 | 0.024 |
|    5 |  390 | 0.042 |
|    6 |   61 | 0.054 |
|    7 |  591 | 0.034 |
|    8 |  445 | 0.038 |
|    9 |  505 | 0.035 |
|   10 |  716 | 0.034 |
|   11 | 1307 | 0.030 |
|   12 | 2552 | 0.023 |
|   13 | 3197 | 0.021 |
|   14 | 3567 | 0.020 |
|   15 | 4095 | 0.019 |
|   16 | 4037 | 0.019 |
|   17 | 4670 | 0.017 |
|   18 | 5349 | 0.016 |
|   19 | 5638 | 0.017 |
|   20 | 6262 | 0.014 |
|   21 | 5634 | 0.014 |
|   22 | 4809 | 0.016 |
|   23 | 5393 | 0.016 |
+------+------+-------+
<snip>
You can see the off-peak effect clearly: at 06:00 UTC, 91% of all TCP_MEM_HIT requests for CSS files take >0.030 seconds.
During "peak" hours, by contrast, most requests are answered in 0.001s to 0.011s (at 18:00 only 5.5% of all requests are above 0.030s).

I know that the numbers reported by Squid also include some "effects" of the network itself.
But we also see similar effects in our active monitoring of HTML+image downloads within our span of control (this is one of our KPIs, which we are exceeding during graveyard-shift hours...).

We have tried a lot of things:
* virtualized versus real hardware (0.002s improvement during peak hours)
* removing the disk cache (Squid then falls back to the default settings compiled in when no disk cache is defined - at least in the version of Squid that we have)
* moving the disk cache to a ramdisk and increasing its size (this has a negative effect!!!) - I wanted to switch to aufs, but the binary we have does not support it...
* tuning some Linux kernel parameters to increase the TCP buffers (example settings below)
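
To be concrete about that last point: the exact parameters are not the issue here, but the knobs in question are the usual TCP buffer sysctls. The values below are illustrative only, not what we actually run:

# /etc/sysctl.conf fragment - illustrative values only
# maximum socket receive/send buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# min/default/max TCP receive and send buffers
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216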

Has anyone experienced similar behaviour and got any recommendations on what else we can do/test (besides upgrading to Squid 3.1, which is a major effort from a testing perspective and may not resolve the issue either)?


Squid is still largely I/O event driven. If network I/O drops below, say, 3-4 requests/sec, Squid can have a queue of pending work which gets delayed a long time (hundreds of ms) waiting for an I/O event to kick it off.
Your overview seems to show that behaviour clearly.
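
A toy sketch of the mechanism (nothing like Squid's real code, which is C++; this just shows how work queued behind a select/poll loop waits out the timeout when no traffic arrives):

import select
import socket
import time

# One listening socket stands in for all of Squid's network FDs.
listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))
listener.listen(8)

# Internal work queued for "the next loop iteration" at a quiet moment.
deferred = [time.monotonic()]

while deferred:
    # The loop sleeps in select() for up to 500 ms. With no traffic,
    # nothing wakes it early, so the queued work waits the full timeout;
    # under load, select() returns immediately and the queue drains fast.
    readable, _, _ = select.select([listener], [], [], 0.5)
    for fd in readable:
        fd.accept()
    while deferred:
        queued_at = deferred.pop(0)
        print("queued work delayed %.3f s" % (time.monotonic() - queued_at))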

There have been some small improvements and fixes to several of the lagging things, but I think it's still there even in the latest Squid.


Knowing that it only happens under very low load and self-corrects as soon as traffic picks up: is it still a problem? If so, you may want to contact The Measurement Factory and see if they have anything that helps for 3.0.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.8
  Beta testers wanted for 3.2.0.2

