Re: url rewrite and cache, which URL should be cached?

Hi Henrik,

Henrik Nordstrom wrote:
On Tue, 2007-10-09 at 17:47 +0200, Sylvain Viart wrote:
Hi,

I use a redirector on an accel proxy config.

url_rewrite_program /etc/squid/redirector.pl
url_rewrite_children 15
url_rewrite_concurrency 0
url_rewrite_host_header off


It seems that the URL used to store the cached object is the original URL, not the rewritten one.

The cache is using the rewritten URL.
Here is some more detail.

The redirector script:
#--------------------------8< ------------------------------------
#!/usr/bin/perl
#
# request_URI    client_IP/FQDN    username HTTP_method
#
# url_rewrite_program
# URL <SP> client_ip "/" fqdn <SP> user <SP> method <SP> urlgroup <NL>
#
#
# access.log
# 1190984604.736 6 12.34.56.78 TCP_MISS/200 1752 GET http://proxy-03.mydomain.com/thumb/100/default_woman.jpg - ROUNDROBIN_PARENT/php-03 image/jpeg
#
#
# !perl -n -e ' m@(http://[^ ]+)@; print "$1\n";' < /var/log/squid/access.log
$|=1;

$filer = 'filer-01';
@filer_domain = qw /img.mydomain.com/;

# escape regex metacharacters so the dots in each domain match literally
$filer_host = join('|', map { quotemeta } @filer_domain);

# dummy domain name for canonical rewriting
$php = 'static-php';

$n = 0;
while (<>)
{
       # group the alternation so it still works with more than one filer domain
       if (m[^http://(?:$filer_host)/] || m[^http://([^/]+)/(js/static_file|media|thumb)])
       {
               #$host = &get_filer;
               s#http://[^/]+(:[0-9]+)?#http://$filer#;

               # add urlgroup
               s/^/!filer! /;
       }
       else
       {
               if($_ !~ /\.php/)
               {
                       s#http://[^/]+(:[0-9]+)?#http://$php#;
               }

               # add urlgroup
               s/^/!php! /;
       }

       print;
}
#--------------------------8< ------------------------------------
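
For checking the rules offline, the same rewriting can be modeled as a pure function. This is only a sketch in Python, not the helper that Squid runs: the function name `rewrite` and the constant names are made up for illustration, and the host names are the ones from the script above.

```python
import re

FILER = "filer-01"                    # rewritten host for static files (from the script)
FILER_DOMAINS = ["img.mydomain.com"]  # hosts that always go to the filer
PHP = "static-php"                    # dummy canonical host for non-.php content

_HOST = re.compile(r"^http://[^/]+")
_FILER_HOST = re.compile(
    r"^http://(?:%s)/" % "|".join(re.escape(d) for d in FILER_DOMAINS)
)
_STATIC_PATH = re.compile(r"^http://[^/]+/(?:js/static_file|media|thumb)")


def rewrite(line: str) -> str:
    """Return the helper reply for one request line: '!urlgroup! URL ...'."""
    if _FILER_HOST.search(line) or _STATIC_PATH.search(line):
        # static content: point at the filer and tag with the 'filer' urlgroup
        return "!filer! " + _HOST.sub("http://" + FILER, line, count=1)
    if ".php" not in line:
        # non-PHP content: canonicalize the host name
        line = _HOST.sub("http://" + PHP, line, count=1)
    return "!php! " + line
```

Note that `/js/mailbox.js` does not match `js/static_file`, so it falls into the `php` group with a canonicalized host, which agrees with the output shown below.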

The redirector returns:
echo "http://mes-test2.mydomain.com/js/mailbox.js" | perl t.pl
!php! http://static-php/js/mailbox.js

echo "http://sometestagain.mydomain.com/js/mailbox.js" | perl t.pl
!php! http://static-php/js/mailbox.js

Some store.log entries:

1191938682.998 SWAPOUT 00 000002E5 2AC12C498B97741871A11F0290E927C8 200 1191938682 1180514999 -1 application/x-javascript 418/418 GET http://mes-test.mydomain.com/js/mailbox.js
1191938722.433 SWAPOUT 00 000002E7 C61344FEBB15FC2C7D039A36A2EE552D 200 1191938722 1180514999 -1 application/x-javascript 418/418 GET http://mes-test2.mydomain.com/js/mailbox.js

I would expect both to be stored under
http://static-php/js/mailbox.js
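
To illustrate why the two SWAPOUT entries above carry different keys, here is a rough Python sketch. It assumes only that the store key is some digest over the request method and URL; the exact bytes Squid hashes are not reproduced here, so the values will not match the ones in store.log, and `cache_key` is a made-up name.

```python
import hashlib


def cache_key(method: str, url: str) -> str:
    # Illustrative key only: a digest over method and URL.
    # The real input that Squid feeds to its hash is an assumption here.
    return hashlib.md5(f"{method} {url}".encode()).hexdigest().upper()


# Keyed by the per-host URLs: two distinct objects for the same file.
key_a = cache_key("GET", "http://mes-test.mydomain.com/js/mailbox.js")
key_b = cache_key("GET", "http://mes-test2.mydomain.com/js/mailbox.js")

# Keyed by the canonical URL: every host would share one object.
key_canonical = cache_key("GET", "http://static-php/js/mailbox.js")
```

With per-host URLs the keys differ, so the same file is stored once per host name; with the canonical URL there would be a single key.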

Associated ACLs and urlgroups:

# urlgroup matching acl from url_rewrite_program
acl static_doc urlgroup filer
acl php_doc urlgroup php

# filer server access rules
cache_peer_access filer-01 allow static_doc
cache_peer_access filer-01 deny all

# php server access exclusion for static_doc matched on the filer
#cache_peer_access php-01 deny static_doc
#cache_peer_access php-02 deny static_doc
#cache_peer_access php-03 deny static_doc
cache_peer_access php-04 deny static_doc

Could it be related to the urlgroup, which somehow hides the rewriting?

The sequence is approximately:

* Request accepted
* http_access Access controls
* URL rewriting, replacing Squid's idea of the URL
* http_access2 Access controls
* Cache lookup
* Forwarding on cache miss
* http_reply_access Reply access controls
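
Read literally, the sequence above means the cache lookup only ever sees the rewritten URL. A minimal Python sketch of that ordering follows; the access-control steps are left out, and `handle_request`, `rewrite`, `cache`, and `fetch` are hypothetical stand-ins, not Squid internals.

```python
from typing import Callable, Dict, Tuple


def handle_request(url: str,
                   rewrite: Callable[[str], str],
                   cache: Dict[str, bytes],
                   fetch: Callable[[str], bytes]) -> Tuple[str, bytes]:
    # Steps 1-2: request accepted, http_access on the original URL (omitted).
    url = rewrite(url)        # step 3: rewriting replaces the working URL
    # Step 4: http_access2 would run here on the rewritten URL (omitted).
    if url in cache:          # step 5: cache lookup uses the rewritten URL
        return "HIT", cache[url]
    body = fetch(url)         # step 6: forward on cache miss
    cache[url] = body
    # Step 7: http_reply_access (omitted).
    return "MISS", body
```

Under this model, two requests for different host names that rewrite to the same canonical URL produce one MISS followed by one HIT.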


Because of this, using "url_rewrite_host_header off" can be a very bad
thing, as it makes the requested URL sent to the web server differ from
the cache URL, and it can easily bite you.
It seems it bites. :-)

But that is what I want; it worked with the Squid 2.5 redirector without urlgroup:
* cache: canonicalized URL
* peer: original URL

It would be simpler if it worked like that for my config.
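
To make the wanted behaviour concrete, here is a simplified Python model of what I understand the directive to do: with "url_rewrite_host_header off" the Host header sent to the peer keeps the original host name, while the cache uses the rewritten URL. The function name `forwarded_request` and the returned dictionary are invented for illustration; this is not how Squid represents a request.

```python
from urllib.parse import urlsplit


def forwarded_request(original_url: str, rewritten_url: str,
                      rewrite_host_header: bool) -> dict:
    orig = urlsplit(original_url)
    new = urlsplit(rewritten_url)
    # "on": Host follows the rewritten URL; "off": Host keeps the original.
    host = new.netloc if rewrite_host_header else orig.netloc
    return {
        "cache_url": rewritten_url,  # what the cache lookup would use
        "host_header": host,         # what the peer would see as Host:
        "path": new.path,
    }
```

With `rewrite_host_header=False`, a request for `http://mes-test2.mydomain.com/js/mailbox.js` rewritten to `http://static-php/js/mailbox.js` is cached under the canonical URL but still reaches the peer with `Host: mes-test2.mydomain.com`.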

Is there more documentation on "url_rewrite_host_header off"?

Regards,
Sylvain.


