
Re: Multiple squid cache instances throttled by website

On 28.02.2012 10:36, Francis Fauteux wrote:
> We are using a collection of Squid instances (version 2.7.stable9) as
> caching proxies behind a single gateway IP, processing requests from a
> large number of users.


By "gateway IP" do you mean a NAT gateway? Or each Squid instance setting its outgoing IP to the same value?

If this is a NAT gateway, could it be simple NAT table pair/tuple limitations?
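To give a sense of the scale of that limit, here is a rough back-of-the-envelope sketch. It assumes a Linux NAT gateway with the default ephemeral port range (net.ipv4.ip_local_port_range = 32768 60999); check the actual value on your gateway with sysctl, since this is an assumption, not something taken from your setup:

```python
# Rough estimate of the NAT pair/tuple ceiling when many Squid
# instances share one gateway IP talking to one origin IP:port.
# Assumed default Linux ephemeral port range: 32768..60999.
low, high = 32768, 60999
ephemeral_ports = high - low + 1  # distinct source ports available

# Every concurrent connection to the same origin IP:port needs a
# unique (gateway IP, source port) tuple, so this is the hard ceiling
# no matter how many Squid instances sit behind the NAT.
print(ephemeral_ports)  # 28232
```

If your aggregate concurrency toward one origin approaches that ceiling, connections stall at the NAT regardless of anything Squid does.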


> We've observed that a number of websites throttle our usage when
> requests targeting a given domain are processed by two or more
> instances. This does not occur when all requests to this domain are
> processed by a single instance.

> We are trying to find the root cause for this behaviour, and the fact
> that it does not occur with a single Squid instance may help us
> diagnose. From the origin server's perspective, only two changes are
> visible between using a single instance and using two or more:
>
> * The value injected in the 'Via' header differs between Squid
> instances. The web server may not expect requests coming from a single
> IP to contain different values for the HTTP 'Via' header. This is
> something we can investigate ourselves, but input would be welcome.

If the web server is in fact doing such checks, it is in violation of the HTTP specification. HTTP is message-based in the same way that TCP is packet-based: which route the message/packet took is mostly irrelevant. They could be checking it for security access control, but that should not have side effects like this. Your multiple instances could even share the same TCP connection and expect it to work (Squid does pipeline multiplexing).
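For illustration, a minimal sketch of why the origin sees differing values: each proxy appends its own entry to the Via header it forwards (format per RFC 2616 section 14.45). The hostnames here are hypothetical, not your instance names:

```python
# Sketch: two Squid instances behind the same gateway IP produce
# different Via values because each appends its own hostname.
def forward_via(existing_via, proxy_host, version="1.0"):
    """Return the Via header value a proxy would forward upstream."""
    entry = f"{version} {proxy_host} (squid/2.7.STABLE9)"
    return f"{existing_via}, {entry}" if existing_via else entry

print(forward_via(None, "cache-a"))  # 1.0 cache-a (squid/2.7.STABLE9)
print(forward_via(None, "cache-b"))  # same client IP, different Via
```

An origin keying any decision on that value would see two "clients" where there is really one gateway IP.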

There is no way the server can rely on a specific one of these chaining scenarios:
  client->A->server
  client->B->server
  client->A->B->server


And speaking of those scenarios, it seems more likely to me that the third scenario is what is happening to you. Each layer of proxying adds latency, so messages doing the A->B hop could appear slower (throttled?) than when it is not present. The CARP design is specifically tuned to make such multi-hop layering efficient, but generic peer clusters doing it can slow things down.
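If the layered arrangement is intentional, a CARP setup on the frontend instances could look something like the sketch below. Hostnames and ports are placeholders, not taken from your configuration:

```
# Hypothetical frontend squid.conf fragment: hash requests across a
# set of CARP parent caches instead of generic sibling peering.
cache_peer parent-a.example.com parent 3128 0 carp
cache_peer parent-b.example.com parent 3128 0 carp

# Force requests through the parents rather than going direct.
never_direct allow all
```

CARP hashes each URL to one parent, so a given domain's objects are cached once instead of being fetched independently by every peer.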


> * If each Squid limits the number of connections to a given server,
> using several instances may cause the origin server to see a number of
> connections which exceeds what they expect to see from a single IP.
> This is the question for this forum: does Squid actually limit the
> number of per-server connections? Is this number configurable (either
> in squid.conf or by rebuilding)?

The default is not to limit. You can configure a limit on client connections if you wish.

If this is relevant, it would be in the form of a client connection limit at the server end.
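For reference, the client-side limit is done with the maxconn ACL in squid.conf; the number here is purely illustrative:

```
# Deny a client IP once it has more than 50 concurrent connections
# to this Squid instance. Note this caps per-client connections;
# there is no per-origin-server connection cap.
acl toomany maxconn 50
http_access deny toomany
```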


> Note that each affected website resolves to a single IP; Squid
> instances are not receiving different IPs from DNS servers.


Amos

