On 12.09.2012 10:54, Saurabh Sheth wrote:
Squid (versions 3.1 and 2.6) has an object in its cache and responds to individual requests for this object just fine (TCP_HIT:NONE). From the access.log:

10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 41136 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 24752 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 28848 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 41136 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 24752 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 45232 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 28848 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 49328 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 49328 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 32944 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 37040 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET http://originserver/data/object HTTP/1.1" 200 37040 TCP_HIT:NONE

However, when I make a huge number of concurrent requests for the same object, Squid fails to load the object from disk fast enough and gives a TCP_SWAPFAIL_MISS:

10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 53424 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 37031 TCP_SWAPFAIL_MISS:DIRECT
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 28839 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE

All subsequent requests hit the origin server directly, causing a huge load on the origin server (TCP_MISS:NONE):

10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 28839 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 37031 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 37031 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE

This is undesirable in a production setup, since such a huge number of requests hitting the origin server directly has the effect of a DOS attack on the origin server. This has brought down our origin server more than once now.
Well, when you think about it, this is a DOS on Squid as well. The backend server is only facing the overflow which Squid can't erase fast enough. So any attacker trying this has to pass *two* DOS thresholds: first the Squid one, then the backend one on top. There is always another idiot infected PC, so DOS resolution is not about *solving* the traffic problem, but about raising the bar and reducing the impact/damage when it happens.
I am looking for any help or pointers on how I can deal effectively with such a huge number of concurrent requests to Squid for the same object; any help is highly appreciated. I am already considering rate limiting with iptables; however, if there is an effective way to deal with this in the Squid configuration itself, I would love to understand it.
You were a bit vague about which specific release versions of Squid you have. 2.6 should have the collapsed forwarding feature, which acts as a great DOS barrier. It has not been ported to squid-3 yet, but efficiencies have been improved in the cache handling, so you could try the latest 2.7 or 3.2 releases and see if this raises the bar high enough for you.
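To illustrate the idea behind collapsed forwarding, here is a rough conceptual sketch in Python. It is not Squid's code and not how Squid implements it internally; the class name CollapsingFetcher, the slow_origin function and the demo helper are all made up for illustration. The point is only that concurrent requests for the same URL wait on one in-flight origin fetch instead of each going to the backend. In 2.6/2.7 the squid.conf directive should be `collapsed_forwarding on`, as far as I recall; please check the documentation for your exact release.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class CollapsingFetcher:
    """Illustrative only: coalesce concurrent fetches of one URL into one origin request."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callable: url -> body
        self._lock = threading.Lock()
        self._in_flight = {}             # url -> shared state of the running fetch

    def get(self, url):
        with self._lock:
            entry = self._in_flight.get(url)
            leader = entry is None
            if leader:
                # First client for this URL: it will perform the origin fetch.
                entry = {"done": threading.Event(), "body": None, "error": None}
                self._in_flight[url] = entry
        if leader:
            try:
                entry["body"] = self._fetch(url)
            except Exception as exc:
                entry["error"] = exc
            finally:
                with self._lock:
                    del self._in_flight[url]
                entry["done"].set()      # wake every client that joined this fetch
        else:
            # Later clients wait for the leader instead of hitting the origin.
            entry["done"].wait()
        if entry["error"] is not None:
            raise entry["error"]
        return entry["body"]


def demo(clients=20):
    """Run `clients` concurrent requests; return (origin request count, bodies)."""
    calls = []
    release = threading.Event()

    def slow_origin(url):
        calls.append(url)         # record each request that reaches the origin
        release.wait(timeout=5)   # simulate a slow origin response
        return "body of " + url

    fetcher = CollapsingFetcher(slow_origin)
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(fetcher.get, "http://originserver/data/object")
                   for _ in range(clients)]
        time.sleep(0.5)           # give every client time to join the in-flight fetch
        release.set()
        bodies = [f.result() for f in futures]
    return len(calls), bodies
```

Calling demo() should show all clients receiving the same body while the made-up origin function is invoked far fewer times than there are clients. A request arriving after the fetch has finished starts a new fetch in this sketch; in a real proxy it would normally be answered from the cache.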
Amos