Re: Request processing question

On Apr 6, 2008, at 4:59 AM, Henrik Nordstrom wrote:
On Sat, 2008-04-05 at 23:26 -0400, David Lawson wrote:
I've got a couple questions about how Squid chooses to fulfill a
request.  Basically, I've got a cache with a number of sibling peers
defined.  Some of the time it makes an ICP query to those peers and
then does everything it should: takes the first hit, makes the HTTP
request for the object via that peer, and so on.  Some, perhaps most,
of the time, it doesn't even make an ICP query for the object; it just
goes direct to the origin server.

The primary distinction is hierarchical/nonhierarchical requests.
Siblings are only queried on hierarchical requests.

non-hierarchical:
 - reload requests
 - cache validations if you have non-Squid ICP peers
 - non-GET/HEAD/TRACE requests
 - authenticated requests
 - matching hierarchy_stoplist
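
The list above can be read as a simple classifier. What follows is only a minimal sketch of those five rules in Python, not Squid's actual code: the Request class, its field names, and the is_hierarchical function are all hypothetical, and it assumes hierarchy_stoplist entries are matched as plain substrings of the URL.

```python
from dataclasses import dataclass

# Methods that can still be hierarchical, per the list above.
HIERARCHICAL_METHODS = {"GET", "HEAD", "TRACE"}


@dataclass
class Request:
    """Hypothetical, simplified view of a client request."""
    method: str
    url: str
    is_reload: bool = False         # client forced a reload
    is_validation: bool = False     # cache validation of a stored object
    is_authenticated: bool = False  # request carries credentials


def is_hierarchical(req, stoplist=(), has_non_squid_icp_peers=False):
    """Return True if siblings would be queried under the rules listed above."""
    if req.is_reload:
        return False
    if req.is_validation and has_non_squid_icp_peers:
        return False
    if req.method.upper() not in HIERARCHICAL_METHODS:
        return False
    if req.is_authenticated:
        return False
    # Assumption: a stoplist entry matches when it appears anywhere in the URL.
    if any(word in req.url for word in stoplist):
        return False
    return True
```

Under this sketch, a plain GET with no stoplist match is hierarchical, so a request that skips ICP would need to trip one of the five checks.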

Hmmm, okay, that was more or less the assumption I was working under, but the behavior I'm seeing doesn't seem to match that. One of my coworkers did a packet capture of two requests, one of which resulted in an ICP query, the other of which bypassed the ICP query process entirely and went direct to the origin.

ICP:

   GET http://www.foo.com:8881/towns/baz/x1151547945 HTTP/1.0\r\n
       Request Method: GET
       Request URI: http://www.foo.com:8881/towns/baz/x1151547945
       Request Version: HTTP/1.0
   Host: www.foo.com:8881\r\n
   Accept: text/html,text/plain,application/*\r\n
   From: user@xxxxxxxxxx\r\n
   User-Agent: gsa-crawler (Enterprise; GIX-01642; user@xxxxxxxxxx)\r\n
   Accept-Encoding: gzip\r\n
   If-Modified-Since: Sun, 16 Mar 2008 22:22:39 GMT\r\n
   Via: 1.0 cache2.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
   X-Forwarded-For: 64.233.190.112\r\n
   Cache-Control: max-age=86400\r\n
   \r\n

Non-ICP:

Hypertext Transfer Protocol
   GET http://www.bar.com:8881/baz/news/rss HTTP/1.0\r\n
       Request Method: GET
       Request URI: http://www.bar.com:8881/baz/news/rss
       Request Version: HTTP/1.0
   Host: www.wickedlocal.com:8881\r\n
   User-Agent: Yahoo-Newscrawler/3.9 (news-search-crawler at yahoo-inc dot com)\r\n
   Via: 1.0 cache4.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
   X-Forwarded-For: 69.147.86.154\r\n
   Cache-Control: max-age=86400\r\n
   \r\n

Any ideas about why those requests were processed differently?

I've also got a broader, more general question about how a request
flows through the Squid process: when are ACLs processed, before or
after any rewriter is applied to the URLs, etc.  But that's really
secondary; right now I'm just concerned with the ICP question.

Depends on which access directive you look at. Generally speaking
http_access is before url rewrites, the rest after.


Ah, okay.  Thanks Henrik, I appreciate the info.

--Dave
Systems Administrator
Zope Corp.
540-361-1722
david@xxxxxxxx




