On 24/08/11 06:51, Mateusz Buc wrote:
2011/8/23 Amos Jeffries:
You have an internal part of your site performing GET requests?
Or did you mean it generates an index page containing a set of volatile URLs
for IMG or A tags?
Sorry, it was a misunderstanding. index.cgi contains a set of volatile
URLs, such as the gen.cgi link example I provided earlier.
Okay.
Ouch. Bad, bad, bad for caching. Caches only work when the URLs are stable,
with repeated calls to the same ones.
I forgot to mention that gen.cgi URLs only change when the log data
changes, so sometimes it could be once every 15 minutes, I suppose. It
I kind of assumed that was the case. All my previous suggestions still
apply unchanged.
This extra info only means that you can probably succeed with simple
ETag calculations, which leads to some rather nice effects.
all depends on the information index.cgi gets from the data-gathering
software. It changes timestamps or other info in the URL only when the log
data changes.
That is where I have a big conceptual problem:
gen.cgi is accessing and using the data, yet it does not have any way to
identify when that data last changed?
If you can make gen.cgi have that info, what index.cgi knows becomes
irrelevant.
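To make that concrete, here is a minimal sketch of how a script like gen.cgi might derive its validators from the data it reads and answer conditional requests. It is not the actual gen.cgi. The function names, the use of the data file's mtime and size as the change signal, and the 900-second max-age (taken from the "once every 15 minutes" estimate) are all assumptions for illustration.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def make_validators(mtime, size):
    """Build ETag and Last-Modified values from the data file's state.

    mtime and size would typically come from os.stat() on the log data
    (hypothetical; any reliable "last changed" signal would do).
    """
    etag = '"%x-%x"' % (int(mtime), size)
    last_modified = formatdate(int(mtime), usegmt=True)
    return etag, last_modified


def respond(mtime, size, if_none_match=None, if_modified_since=None,
            max_age=900):
    """Return (status, headers) for a request, honouring conditional headers."""
    etag, last_modified = make_validators(mtime, size)
    headers = {
        "ETag": etag,
        "Last-Modified": last_modified,
        # Assumed lifetime: roughly how often the log data changes.
        "Cache-Control": "public, max-age=%d" % max_age,
    }
    not_modified = False
    if if_none_match is not None:
        # If-None-Match takes precedence; compare weakly (ignore a W/ prefix).
        tags = [t.strip() for t in if_none_match.split(",")]
        tags = [t[2:] if t.startswith("W/") else t for t in tags]
        not_modified = "*" in tags or etag in tags
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            not_modified = int(mtime) <= since.timestamp()
        except (TypeError, ValueError):
            not_modified = False  # unparseable date: send the full response
    return (304 if not_modified else 200), headers
```

In a real CGI script the conditional headers would be read from the HTTP_IF_NONE_MATCH and HTTP_IF_MODIFIED_SINCE environment variables, and a 304 reply would skip generating the body altogether.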
Today I ran an experiment and analyzed the Apache server
logs. It turned out that about 40% of all gen.cgi links were repeated
at least once - it means that they were downloaded at least twice. So
I guess they should be cached at least for a while, shouldn't they?
Maybe. We would need to see the HTTP headers produced by gen.cgi to be
sure. From the description of how index.cgi/gen.cgi interact, I think it
is highly likely that the lack of Cache-Control and Last-Modified
information from gen.cgi is causing the cache algorithms to determine
it's unsafe to store.
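As a quick way to check what gen.cgi actually sends, a small helper along these lines could list which caching-related headers are missing from a response. This is only a sketch: the function names are made up for illustration, the list of headers is a reasonable starting set rather than the exact criteria any particular cache applies, and the HEAD request assumes the script answers HEAD the same way it answers GET.

```python
import urllib.request

# Response headers a cache commonly looks at when deciding whether,
# and for how long, an object may be stored (assumed starting set).
CACHE_HINTS = ["cache-control", "expires", "last-modified", "etag"]


def missing_cache_hints(headers):
    """Return the caching-related header names absent from `headers`.

    `headers` is any mapping or iterable of header names; matching is
    case-insensitive, as HTTP header names are.
    """
    present = {name.lower() for name in headers}
    return [name for name in CACHE_HINTS if name not in present]


def fetch_headers(url):
    """Fetch response headers for `url` with a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.getheaders())
```

Calling `missing_cache_hints(fetch_headers(url))` on one of the gen.cgi URLs would show at a glance whether Cache-Control and Last-Modified are being sent at all.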
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.14
Beta testers wanted for 3.2.0.10