On 25/08/11 21:38, Mateusz Buc wrote:
2011/8/24 Amos Jeffries<squid3@xxxxxxxxxxxxx>:
Maybe. We would need to see the HTTP headers produced by gen.cgi to be sure.
From the description of how index.cgi/gen.cgi interact I think it highly
likely the lack of Cache-Control and Last-Modified information from gen.cgi
is causing the cache algorithms to determine it's unsafe to store.
I gained access to the code of gen.cgi and made a few changes:
printf("Cache-Control: max-age=600, s-maxage=300\n");
printf("Last-Modified: %s\n",mdate);
It now fetches the timestamp from the URL, converts it to the appropriate
format and then outputs it as the Last-Modified header. Plus I added
Cache-Control.
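A minimal sketch of the timestamp-to-header step described above. It
assumes the timestamp taken from the URL is a Unix time in seconds; the
function name format_http_date is hypothetical and is not taken from
gen.cgi. The day and month names come out in English only while the
program stays in the default "C" locale.

```c
#include <stdio.h>
#include <time.h>

/* Format a Unix timestamp as an HTTP date, for example
 * "Thu, 01 Jan 1970 00:00:00 GMT". Returns the number of characters
 * written, or 0 if the time is invalid or the buffer is too small.
 * Hypothetical helper, not part of the original gen.cgi. */
size_t format_http_date(time_t when, char *buf, size_t buflen)
{
    struct tm *tm_utc = gmtime(&when);   /* HTTP dates are always in GMT */
    if (tm_utc == NULL)
        return 0;
    return strftime(buf, buflen, "%a, %d %b %Y %H:%M:%S GMT", tm_utc);
}
```

The resulting string can then be passed as mdate to the existing
printf("Last-Modified: %s\n", mdate) call.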
Results are noticeable - now I get mostly TCP_REFRESH_UNMODIFIED/304
on my test page (gen.cgi links don't change there, so all timestamps
remain the same all the time).
Thank you a lot for these suggestions!
However, I still can't get these URLs/images cached on my squid. Is
there any chance they can be served directly from the squid cache when
they do not change? Right now I have obviously reduced network
bandwidth, but I am not sure about CPU load - it still takes almost the
same time to load a URL (about 8 seconds).
Halfway there. Stage 1 complete after a fashion.
Meaning of "TCP_REFRESH_UNMODIFIED/304" :
- TCP_ = TCP transport used
- REFRESH = If-Modified-Since sent to origin (aka gen.cgi)
- UNMODIFIED = full object came back. Headers + body apparently
identical to the known cached copy.
- /304 = converted to a 304 "no change" response for the client half
of the transaction.
The 304 portion going across client<->Squid is where you are getting
*all* the bandwidth savings right now.
As I said earlier:
At this point incoming requests will either be requesting brand new content or
have an If-Modified-Since: header containing the cached object's Last-Modified: timestamp.
NOTE: You will not _yet_ see any reduction in the 200 requests. Potentially you might
actually see an increase as "must-revalidate" causes middleware caches to start working better.
The difference between what you are seeing and what I predicted is
caused by your use of max-age instead of must-revalidate.
max-age allows the browsers to cache the graphs for 600 seconds. So
you will get _zero_ repeat traffic for that duration. The exact opposite
of what must-revalidate will do for you.
On top of that you cannot see Squid serving HIT requests because of
s-maxage. It's set at 300, so Squid's copy will expire before the browser's
copy does. When the browser _does_ send an IMS request the Squid copy has
already expired, which forces a contact to gen.cgi to check for updates.
Okay fine, use max-age and s-maxage. To get HITs under the current
circumstances set s-maxage larger than max-age. Or omit it and have
Squid cache for the same length of time as any browser. Squid's cache is
shared by all clients, so you will get some more HITs, but not a lot more.
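The two options above could be sketched in gen.cgi roughly like this.
The helper name format_cache_control and the value 1200 in the usage
note are illustrative assumptions; only max-age=600 comes from the
headers quoted earlier.

```c
#include <stdio.h>

/* Build a Cache-Control value. Pass a negative s_maxage to omit the
 * s-maxage directive, so shared caches such as Squid fall back to max-age.
 * Returns what snprintf returns: the length the text needs, excluding
 * the terminating NUL. Hypothetical helper, not part of gen.cgi. */
int format_cache_control(char *buf, size_t buflen, long max_age, long s_maxage)
{
    if (s_maxage < 0)
        return snprintf(buf, buflen, "max-age=%ld", max_age);
    return snprintf(buf, buflen, "max-age=%ld, s-maxage=%ld",
                    max_age, s_maxage);
}
```

For example, format_cache_control(buf, sizeof buf, 600, 1200) yields
"max-age=600, s-maxage=1200", which lets the Squid copy outlive the
browser copy, and passing -1 for s_maxage yields just "max-age=600".
The value is then printed with printf("Cache-Control: %s\n", buf).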
Do you have any further tips?
Just this: Keep going.
You are roughly up to the end of Step 1 of my earlier instructions.
Step 2 is where the CPU benefits start appearing.
Every time gen.cgi can decide that If-Modified-Since is newer than the
graph data, it saves all the graph production CPU time AND the graph
size worth of bandwidth.
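A rough sketch of what that check might look like inside gen.cgi. All
function names here are hypothetical, and it assumes the graph data
timestamp is available as seconds since the Unix epoch. A CGI program
receives the request header in the HTTP_IF_MODIFIED_SINCE environment
variable and can answer with a "Status: 304 Not Modified" header and no
body. The date parser only accepts the usual "Thu, 25 Aug 2011 09:38:00
GMT" form and does not validate field ranges.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Days between 1970-01-01 and the given Gregorian calendar date. */
static long days_from_civil(long y, int m, int d)
{
    y -= m <= 2;                         /* treat Jan/Feb as end of prior year */
    long era = (y >= 0 ? y : y - 399) / 400;
    long yoe = y - era * 400;
    long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Parse an HTTP date into seconds since the epoch, or return -1. */
long long parse_http_date(const char *s)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char wday[4], mon[4], zone[4];
    int day, year, hh, mm, ss;

    if (s == NULL ||
        sscanf(s, "%3s, %d %3s %d %d:%d:%d %3s",
               wday, &day, mon, &year, &hh, &mm, &ss, zone) != 8)
        return -1;

    const char *p = strstr(months, mon);
    if (strlen(mon) != 3 || p == NULL || (p - months) % 3 != 0 ||
        strcmp(zone, "GMT") != 0)
        return -1;

    int month = (int)(p - months) / 3 + 1;
    return (long long)days_from_civil(year, month, day) * 86400
           + hh * 3600L + mm * 60 + ss;
}

/* Return 1 when the client's copy is at least as new as the graph data. */
int not_modified(const char *if_modified_since, long long data_mtime)
{
    long long ims = parse_http_date(if_modified_since);
    return ims >= 0 && ims >= data_mtime;
}

/* Emit a bodiless 304 and return 1 if the graph need not be regenerated. */
int send_304_if_fresh(long long data_mtime)
{
    if (!not_modified(getenv("HTTP_IF_MODIFIED_SINCE"), data_mtime))
        return 0;
    printf("Status: 304 Not Modified\n\n");
    return 1;
}
```

gen.cgi would call send_304_if_fresh() before any graph generation and
exit immediately when it returns 1. Otherwise it carries on and
produces the full 200 response as it does now.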
Amos
--
Please be using
Current Stable Squid 2.7.STABLE9 or 3.1.14
Beta testers wanted for 3.2.0.10