
Re: Accelerating proxy not matching cgi files


 



On 25/08/11 21:38, Mateusz Buc wrote:
2011/8/24 Amos Jeffries<squid3@xxxxxxxxxxxxx>:

Maybe. We would need to see the HTTP headers produced by gen.cgi to be sure.
From the description of how index.cgi/gen.cgi interact I think it highly
likely the lack of Cache-Control and Last-Modified information from gen.cgi
is causing the cache algorithms to determine it's unsafe to store.


I gained access to the code of gen.cgi and made a few changes:

         printf("Cache-Control: max-age=600, s-maxage=300\n");
         printf("Last-Modified: %s\n", mdate);

It now fetches the timestamp from the URL, parses it into the appropriate
format, and outputs it as the Last-Modified header. I also added Cache-Control.
The results are noticeable - I now get mostly TCP_REFRESH_UNMODIFIED/304
on my test page (the gen.cgi links don't change there, so all timestamps
stay the same).

Thank you very much for these suggestions!

However, I still can't get these URLs/images cached by my Squid. Is
there any chance they can be served directly from the Squid cache when
they do not change? Network bandwidth is obviously reduced now, but I'm
not sure about CPU load: it still takes almost the same time to load
the URL (about 8 seconds).

Halfway there. Stage 1 complete after a fashion.

Meaning of "TCP_REFRESH_UNMODIFIED/304":
 - TCP_ = TCP transport used
 - REFRESH = If-Modified-Since sent to the origin (aka gen.cgi)
 - UNMODIFIED = the full object came back; headers and body apparently identical to the known cached copy
 - /304 = converted to a 304 "no change" response for the client half of the transaction

The 304 portion going across client<->Squid is where you are getting *all* the bandwidth savings right now.

As I said earlier:


At this point incoming requests will either be requesting brand new content or
have an If-Modified-Since: header containing the cached object's Last-Modified: timestamp.

 NOTE: You will not _yet_ see any reduction in the 200 requests. Potentially you might
actually see an increase as "must-revalidate" causes middleware caches to start working better.

The difference you are seeing to what I predicted is caused by your use of max-age instead of must-revalidate.

max-age allows the browsers to cache the graphs for 600 seconds, so you will get _zero_ repeat traffic for that duration. That is the exact opposite of what must-revalidate would do for you.

On top of that, you cannot see Squid serving HIT requests because of s-maxage. It's set at 300, so the Squid copy expires before the browser copies do. By the time a browser _does_ send an IMS request, the Squid copy has already expired, forcing a contact to gen.cgi to check for updates.

Okay, fine, use max-age and s-maxage. To get HITs under the current circumstances, set s-maxage larger than max-age. Or omit it entirely and have Squid cache for the same length of time as any browser. The Squid cache is shared by all clients, so you will get some HITs that way, but not a lot more.


Do you have any further tips?


Just this: Keep going.

You are roughly up to the end of Step 1 of my earlier instructions. Step 2 is where the CPU benefits start appearing.

Every time gen.cgi can decide that If-Modified-Since is newer than the graph data, it saves all the graph-production CPU time AND the graph's size worth of bandwidth.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.14
  Beta testers wanted for 3.2.0.10


