On 3/08/2013 10:45 p.m., x-man wrote:
Hi Amos,
I think this request time is the time needed to serve the entire request.
It is.
How is icp_query_timeout related to that? It should only be about the
query through the ICP protocol.
When determining which sibling peer can be used for the HTTP fetch, ICP
is one of the lookup methods used to check whether the peer has the
content. The ICP reply has to be waited for before a decision is made on
that peer. Two such peer lookups with your timeout (icp_query_timeout
9000) would account for up to 18000 ms of that request service time.
It is unclear from both the log and the code whether the TIMEOUT_ part
is coming from the peer you got connected to or some earlier attempted
peer. I suspect that it is coming from some earlier attempt to identify
a peer.
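One way to narrow this down is to tally the hierarchy codes per peer from
access.log and look at the elapsed time of the TIMEOUT_ entries. The
following is only a rough sketch: it assumes the default native "squid"
log format (hierarchy code and peer in the ninth whitespace-separated
field), and the sample lines, addresses and function names are made up
for illustration.

```python
from collections import Counter

# Made-up sample lines in Squid's default native access.log layout:
# time elapsed client action/status bytes method URL ident hierarchy/peer type
SAMPLE = [
    "1375526700.123   9215 10.0.0.5 TCP_MISS/200 52344 GET "
    "http://example.com/video - TIMEOUT_FIRSTUP_PARENT/192.0.2.10 video/x-flv",
    "1375526701.456    310 10.0.0.5 TCP_MISS/200 1200 GET "
    "http://example.com/page - FIRSTUP_PARENT/192.0.2.10 text/html",
]


def parse_line(line):
    """Return (elapsed_ms, hierarchy_code, peer), or None if the line does not fit."""
    fields = line.split()
    if len(fields) < 10:
        return None
    try:
        elapsed_ms = int(fields[1])
    except ValueError:
        return None
    # Field 9 looks like "TIMEOUT_FIRSTUP_PARENT/192.0.2.10".
    code, _, peer = fields[8].partition("/")
    return elapsed_ms, code, peer


def tally_hierarchy(lines):
    """Count log entries per (hierarchy_code, peer) pair."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, code, peer = parsed
            counts[(code, peer)] += 1
    return counts


def timeout_entries(lines):
    """Return the parsed entries whose hierarchy code starts with TIMEOUT_."""
    entries = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[1].startswith("TIMEOUT_"):
            entries.append(parsed)
    return entries


if __name__ == "__main__":
    for (code, peer), n in tally_hierarchy(SAMPLE).most_common():
        print(n, code, peer)
    for elapsed_ms, code, peer in timeout_entries(SAMPLE):
        print(elapsed_ms, code, peer)
```

To run it against a real log, pass an open file object instead of SAMPLE.
If the TIMEOUT_ entries cluster just above a multiple of the configured
icp_query_timeout, that would fit the explanation above.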
Otherwise, we are using our own cache peer to handle the youtube
content. It supports the ICP protocol and is connected to squid as a
cache peer, and squid (based on an ACL) sends youtube requests to it.
I'm comparing squid 3.1.9 and 3.3.8 and what I notice is that without
changing any other element of the system
with icp_query_timeout 9000 set for both test cases,
with squid 3.1 I don't get any TIMEOUT_FIRSTUP_PARENT in the access.log, and
with squid 3.3.8 I'm getting lots of them and this is reducing our
performance.
Please suggest what can be the difference and what I can check further.
The big difference in this area between those two versions is that we
reorganised the sequence of operations that request forwarding performs
to include DNS lookups for the possible outgoing routes. The result is
that each peer is now guaranteed to be tried only once, and each of its
IPs only once as well, with tcp_outgoing_address working properly
regardless of the IP addressing method.
It is possible that both proxies' ICP queries are timing out, but that
3.1 locates a "usable" peer fast enough to have already moved past that
step to the DNS lookup, which used to be done separately. With the DNS
queries now inside the peer selection stage, 3.3 could be delayed long
enough for the ICP results to get marked on the transaction. That would
mean the success/failure outcome is unchanged between the two versions
and the log is just slightly more accurate.
It is also possible that the improved HTTP/1.1 support is placing a
larger request load on the parent proxy or the network (via traffic
moving faster). Not much can be done about that.
To solve this I think there are two easy ways you can go forward:
1) figuring out why the peer needs the timeout at all and fixing that
problem (could be CPU bound at the peer, network traffic congestion, or
excessive buffering in the network)
2) If this is a setup simply for youtube caching by the parent peer
proxy then I suggest you take a look at moving to the 3.4 series and
taking advantage of the Store-ID feature there, which is an improved
version of the Store-URL feature 2.7 provided.
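To give an idea of what option 2 involves, here is a minimal sketch of a
Store-ID helper. It rests on several assumptions: that the helper
receives one URL per line on stdin (possibly followed by extra fields),
that it replies "OK store-id=..." or "ERR", and that no concurrency
channel ID is in use. The cdnN.example.com pattern and the function names
are hypothetical and stand in for whatever URL normalisation the content
really needs.

```python
import io
import re
import sys

# Hypothetical rule: cdn1.example.com, cdn2.example.com, ... serve the same objects.
MIRROR = re.compile(r"^http://cdn\d+\.example\.com/(.*)$")


def respond(line):
    """Build the helper reply for one request line (assumes no channel ID)."""
    tokens = line.split()
    if not tokens:
        return "ERR"
    match = MIRROR.match(tokens[0])
    if match is None:
        # No rewrite applies, so the URL keeps its normal cache key.
        return "ERR"
    return "OK store-id=http://cdn.example.squid.internal/" + match.group(1)


def serve(instream, outstream):
    """Answer each request line as it arrives, flushing after every reply."""
    for line in instream:
        outstream.write(respond(line) + "\n")
        outstream.flush()


if __name__ == "__main__":
    # Demo on in-memory input; a real helper would call serve(sys.stdin, sys.stdout).
    demo = io.StringIO(
        "http://cdn2.example.com/v/abc\n"
        "http://other.example.org/x\n"
    )
    serve(demo, sys.stdout)
```

In 3.4 such a script would be hooked in through the store_id_program
directive. Check the exact request and reply syntax against the Store-ID
documentation for your release before relying on this.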
Amos