On Apr 6, 2008, at 4:59 AM, Henrik Nordstrom wrote:
Sat 2008-04-05 at 23:26 -0400, David Lawson wrote:
I've got a couple of questions about how Squid chooses to fulfill a
request. Basically, I've got a cache with a number of sibling peers
defined. Some of the time it makes an ICP query to those peers and
then does everything it should: it takes the first hit, makes the HTTP
request for the object via that peer, etc. Some, perhaps most, of the
time, it doesn't even make an ICP query for the object; it just goes
direct to the origin server.
The primary distinction is between hierarchical and non-hierarchical
requests. Siblings are only queried on hierarchical requests.
Non-hierarchical:
- reload requests
- cache validations if you have non-Squid ICP peers
- non-GET/HEAD/TRACE requests
- authenticated requests
- matching hierarchy_stoplist
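The list above can be read as a simple decision function. The following is only a rough sketch in Python, not Squid's actual code: the function name, its parameters, and the way each condition is detected (an Authorization header for "authenticated", a no-cache directive for "reload", substring matching for hierarchy_stoplist) are assumptions made for illustration.

```python
# Illustrative model of the list above; this is NOT Squid's source code.
# All names and detection rules here are assumptions for the example.

HIERARCHICAL_METHODS = {"GET", "HEAD", "TRACE"}


def is_hierarchical(method, url, headers, stoplist=(),
                    is_validation=False, non_squid_icp_peers=False):
    """Return True if, under this model, sibling peers would be queried."""
    # Normalize header names so lookups are case-insensitive.
    names = {k.lower(): v.lower() for k, v in headers.items()}

    # Non-GET/HEAD/TRACE requests are non-hierarchical.
    if method.upper() not in HIERARCHICAL_METHODS:
        return False

    # Authenticated requests (assumed: an Authorization header is present).
    if "authorization" in names:
        return False

    # Reload requests (assumed: the client sent a no-cache directive).
    if "no-cache" in names.get("cache-control", "") or \
            "no-cache" in names.get("pragma", ""):
        return False

    # Cache validations, but only when non-Squid ICP peers are configured.
    if is_validation and non_squid_icp_peers:
        return False

    # URLs matching hierarchy_stoplist (assumed: plain substring match).
    if any(word in url for word in stoplist):
        return False

    return True
```

For example, with a hypothetical stoplist of `("cgi-bin", "?")`, `is_hierarchical("GET", "http://example.com/cgi-bin/x", {}, stoplist=("cgi-bin", "?"))` returns False, while the same call for `http://example.com/page` returns True.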
Hmmm, okay, that was more or less the assumption I was working under,
but the behavior I'm seeing doesn't seem to match that. One of my
coworkers did a packet capture of two requests, one of which resulted
in an ICP query, the other of which bypassed the ICP query process
entirely and went direct to the origin.
ICP:
GET http://www.foo.com:8881/towns/baz/x1151547945 HTTP/1.0\r\n
Request Method: GET
Request URI: http://www.foo.com:8881/towns/baz/x1151547945
Request Version: HTTP/1.0
Host: www.foo.com:8881\r\n
Accept: text/html,text/plain,application/*\r\n
From: user@xxxxxxxxxx\r\n
User-Agent: gsa-crawler (Enterprise; GIX-01642; user@xxxxxxxxxx)\r\n
Accept-Encoding: gzip\r\n
If-Modified-Since: Sun, 16 Mar 2008 22:22:39 GMT\r\n
Via: 1.0 cache2.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
X-Forwarded-For: 64.233.190.112\r\n
Cache-Control: max-age=86400\r\n
\r\n
Non-ICP:
Hypertext Transfer Protocol
GET http://www.bar.com:8881/baz/news/rss HTTP/1.0\r\n
Request Method: GET
Request URI: http://www.bar.com:8881/baz/news/rss
Request Version: HTTP/1.0
Host: www.wickedlocal.com:8881\r\n
User-Agent: Yahoo-Newscrawler/3.9 (news-search-crawler at yahoo-inc dot com)\r\n
Via: 1.0 cache4.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
X-Forwarded-For: 69.147.86.154\r\n
Cache-Control: max-age=86400\r\n
\r\n
Any ideas about why those requests were processed differently?
I've also got a broader, more general question about how a request flows
through the Squid process: when ACLs are processed, whether that is
before or after any rewriter has changed the URLs, etc. But that's a
really secondary thing; right now I'm just concerned with the ICP
question.
Depends on which access directive you look at. Generally speaking,
http_access is evaluated before URL rewrites, the rest after.
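As a rough illustration of that ordering (a toy model in Python, not Squid's implementation), the sketch below runs an http_access-style check on the URL as the client sent it, then applies a rewriter, then runs the remaining checks on the rewritten URL. The function `process_request` and its callback parameters are hypothetical names chosen for this example.

```python
# Toy model of the ordering described above; not Squid's implementation.
# process_request and its callback parameters are hypothetical.

def process_request(url, http_access, rewrite, later_checks):
    """Apply http_access, then the rewriter, then the other access checks."""
    # http_access sees the URL exactly as the client sent it.
    if not http_access(url):
        return "denied by http_access", url

    # The URL rewriter runs next.
    rewritten = rewrite(url)

    # The remaining access directives see the rewritten URL.
    for name, check in later_checks.items():
        if not check(rewritten):
            return f"denied by {name}", rewritten

    return "allowed", rewritten


verdict, final_url = process_request(
    "http://old.example/a",
    http_access=lambda u: "old.example" in u,
    rewrite=lambda u: u.replace("old.example", "new.example"),
    later_checks={"cache_peer_access": lambda u: "new.example" in u},
)
```

The practical consequence is that an ACL meant for http_access should match the original URL, while ACLs used by later directives should match the rewriter's output.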
Ah, okay. Thanks Henrik, I appreciate the info.
--Dave
Systems Administrator
Zope Corp.
540-361-1722
david@xxxxxxxx