
I would like to use Squid for caching but it is imperative that all files be cached.

First I will explain what I am trying to do.

I have a number of tests (executables and scripts) which run on resources downloaded via HTTP, FTP, etc. Some of these tests are third-party compiled executables which would be problematic to change. The resources can potentially be any type of file and have different file extensions. Some URLs for these files have query strings. Tests can download resources in any order; there is no way to tell which test will download any given file first. I have no control at all over the resources tested. The tests run on a server which is used for nothing else but running these tests (no human web browsing). It is imperative that all tests run on identical files for each URL. If a file changes between tests, the results will be inconsistent.

Therefore, it is imperative that all files be cached unconditionally. I would like to use Squid for this caching. The only things that should not be cached are HTTP response codes such as Internal Error 500, Service temporarily overloaded 502 and suchlike, where it is better to have some tests run rather than none in the case of a temporary server error. I guess it would be too much to ask to be able to cache over HTTPS.
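The caching rule described above can be written down as a small predicate. This is only an illustration of the intended policy, not Squid configuration or Squid code; the function name `should_store` is hypothetical, and reading "and suchlike" as the whole 5xx class is an assumption.

```python
def should_store(status: int) -> bool:
    """Return True if a response with this HTTP status should be kept in the cache.

    Assumption: "500, 502 and suchlike" is read as every 5xx server error.
    Everything else is stored, regardless of headers or query strings.
    """
    return not 500 <= status <= 599


if __name__ == "__main__":
    for status in (200, 404, 500, 502):
        print(status, "store" if should_store(status) else "do not store")
```

Under this reading, a 404 would also be stored; whether that is desirable depends on how the tests should treat missing resources.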

I have tried to configure Squid 2.7.STABLE9 to cache everything regardless. These are the changes I made to the default configuration file supplied with the default distribution on Ubuntu Server 10.10:

< # http_access deny all

> http_access allow all


< # hierarchy_stoplist cgi-bin ?

> hierarchy_stoplist never_direct


< refresh_pattern ^ftp: 1440 100% 10080

< refresh_pattern ^gopher: 1440 100% 1440

< #refresh_pattern -i (/cgi-bin/|\?) 0 0% 0

< #refresh_pattern (Release|Package(.gz)*)$ 0 20% 2880

< # example line deb packages

< #refresh_pattern (\.deb|\.udeb)$ 129600 100% 129600

< refresh_pattern . 1440 100% 4320

> refresh_pattern .* 1440 100% 4320 ignore-no-cache ignore-private ignore-auth override-expire reload-into-ims


With these settings and a fully primed cache I still get entries like this in my access.log file:

1303398515.769 120 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html

1303398524.140 80 192.168.1.8 TCP_MISS/200 521 HEAD http://domain.com/resource.php - DIRECT/83.223.106.8 text/html

1303398524.536 118 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html

1303398532.671 118 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html


Also, for a URL containing a “?” (even for an HTML file which otherwise caches), I get:

1303398589.824 98 192.168.1.8 TCP_MISS/200 440 HEAD http://domain.com/resource.html? - DIRECT/83.223.106.8 text/html

1303398590.117 141 192.168.1.8 TCP_MISS/200 2665 GET http://domain.com/resource.html? - DIRECT/83.223.106.8 text/html
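
To check whether a given configuration change helps, the access.log excerpts above can be summarized per URL. The following is a minimal sketch, assuming Squid's native access.log layout as shown in those lines (result code in the fourth field, method in the sixth, URL in the seventh); the function names `summarize` and `repeated_misses` are hypothetical helpers, not part of Squid.

```python
from collections import Counter

# Sample lines copied from the access.log excerpts above.
SAMPLE_LOG = """\
1303398515.769 120 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html
1303398524.140 80 192.168.1.8 TCP_MISS/200 521 HEAD http://domain.com/resource.php - DIRECT/83.223.106.8 text/html
1303398524.536 118 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html
1303398532.671 118 192.168.1.8 TCP_MISS/200 7174 GET http://domain.com/resource.php - DIRECT/83.223.106.8 text/html
1303398589.824 98 192.168.1.8 TCP_MISS/200 440 HEAD http://domain.com/resource.html? - DIRECT/83.223.106.8 text/html
1303398590.117 141 192.168.1.8 TCP_MISS/200 2665 GET http://domain.com/resource.html? - DIRECT/83.223.106.8 text/html
"""


def summarize(log_text):
    """Count (method, URL, result code) triples in access.log text."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        result = fields[3].split("/")[0]  # "TCP_MISS" from "TCP_MISS/200"
        method, url = fields[5], fields[6]
        counts[(method, url, result)] += 1
    return counts


def repeated_misses(counts):
    """Return entries logged as a plain TCP_MISS more than once."""
    return {key: n for key, n in counts.items()
            if key[2] == "TCP_MISS" and n > 1}


if __name__ == "__main__":
    for (method, url, result), n in sorted(summarize(SAMPLE_LOG).items()):
        print(n, result, method, url)
```

With a fully primed cache, any entry returned by `repeated_misses` is a URL that Squid fetched from the origin server again instead of serving from the cache.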


Can anybody advise whether it is possible to achieve what I intend with configuration changes only, or can somebody point me to a starting point for changing the Squid source code?


