Re: caching dynamic image content

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Fri, 14 Aug 2009 11:36:21 +1200

Terry wrote:
On Thu, Aug 13, 2009 at 9:32 AM, Terry<td3201@xxxxxxxxx> wrote:
On Thu, Aug 13, 2009 at 8:10 AM, Amos Jeffries<squid3@xxxxxxxxxxxxx> wrote:
Terry wrote:
Hello,

I don't have squid implemented yet.

I am researching a web architecture issue I am seeing with a site.
Squid may be a bandaid for what I think may be some poor development
architecture decisions.  There are concerns that the site is written
in a way that browsers and reverse proxies cannot cache it
appropriately. And these aren't my concerns by the way.  We also have
A10 load balancers in house that do some caching.  They said they
can't cache this content.  I don't want to go into their reasoning
because I don't believe it.

Here's an example of an image as seen from the client.  I pulled this
right out of my firefox memory cache:
http://foo.domain.com/Image.aspx?i=db1edbcd-2375-4bae-b33f-a53ced60deed

1. If it's in the memory cache, can I assume that browsers and proxies
can cache it?  Also, I never saw these objects in my disk cache.  Not
sure if that's significant or not.

No. The browser has additional information, such as who is logged in and
whether your session with the website is still the same. Browsers are also
allowed to cache objects that are personal to you.

Proxies and other shared caches only have the URL and some limited other
data (mostly the headers) to base that check on. If there is any chance the
object is private, it will not be cached by default.
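
To make the point about private objects concrete, here is a minimal Python
sketch of the kind of check a shared cache applies before storing a reply.
It is a deliberate simplification of the HTTP caching rules, not Squid's
actual algorithm, and the function names are made up for this example.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict.

    Simplification: arguments that contain commas inside quotes are not handled.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def shared_cache_may_store(request_headers, response_headers):
    """Rough guess at whether a shared cache (proxy) may store a reply.

    Both arguments are dicts keyed by lower-case header names.
    """
    cc = parse_cache_control(response_headers.get("cache-control", ""))
    # "no-store" forbids storing at all; "private" forbids shared caches only.
    if "no-store" in cc or "private" in cc:
        return False
    # Replies to authenticated requests need explicit permission to be shared.
    if "authorization" in request_headers:
        return any(d in cc for d in ("public", "s-maxage", "must-revalidate"))
    # Note: "no-cache" still allows storing; it only forces revalidation.
    return True
```

A browser cache would skip the "private" test, which is one reason an object
can sit in the Firefox cache and still be off limits to a proxy.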

2.  Does firefox still interpret this as an image and cache it as one
or is this considered dynamic content that may be problematic?

Not enough information to even guess. What headers are present? Does the
website require login? Does the same image ever change URL (including the
query string), and if so why, when, and how often? Are alternative image
formats available at the exact same URL?

Any one of those answers may make the object non-cacheable by shared
proxies.
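
The question about alternative formats at the same URL matters because a
cache then has to key stored replies on more than the URL. The sketch below
assumes the server announces this with a Vary reply header; the helper name
and the key layout are hypothetical and only illustrate the idea.

```python
def cache_key(url, request_headers, vary):
    """Build a lookup key from the URL plus the request headers named in Vary.

    request_headers is a dict keyed by lower-case header names; vary is the
    raw value of the Vary reply header ("" if the reply had none).
    """
    if vary.strip() == "*":
        # "Vary: *" never matches a later request, so there is nothing to key on.
        return None
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    selected = tuple((name, request_headers.get(name, "")) for name in names)
    return (url, selected)
```

With "Vary: Accept", a request preferring image/png and one preferring
another format produce different keys for the same URL, so each variant is
stored and served separately.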

I think that's enough information to start a conversation.  Thanks for
any insight!

foo.domain.com does not resolve here so I can't verify the object.
Please pick some of the URLs and enter them into http://www.redbot.org for
review of cacheability.

Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE18
 Current Beta Squid 3.1.0.13

Thank you both for replying.  I haven't messed with squid and caching
for 5+ years and it's all slowly coming back to me.  The identifier
in the URL does not depend on the user's session:
https://foo.domain.com/Image.aspx?i=db1edbcd-2375-4bae-b33f-a53ced60deed

The i=db1edbcd-2375-4bae-b33f-a53ced60deed is a unique identifier for
the image and its size.  Based on that, it should be cacheable, but
the developers are setting it to nocache for some reason.  I am
guessing they reused some code from other dynamic content and
overlooked this aspect.

Just to further prove it's not being cached, here are the request headers:

GET /Pic.aspx?i=db1edbcd-2375-4bae-b33f-a53ced60deed HTTP/1.1
Host: foo.domain.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: https://foo.domain.com/
Cookie: com.domain.foo_SessionID=kua4ew454kjodsjpdlojdu55
Cache-Control: max-age=0

Those are the request headers. The reply headers are more important here,
since the reply is what gets stored.
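
One way to collect the reply headers is a few lines of Python using only the
standard library. This is just a sketch: the list of header names is my own
selection of the ones that usually affect caching, and both function names
are hypothetical.

```python
import urllib.request

# Reply headers that typically influence whether and how long a reply is cached.
CACHE_HEADERS = (
    "cache-control", "expires", "pragma", "etag",
    "last-modified", "vary", "set-cookie", "age",
)


def cache_relevant(headers):
    """Keep only the caching-related headers from (name, value) pairs.

    Names are lower-cased. If a header repeats (e.g. Set-Cookie), only the
    last value is kept, which is enough for a quick look.
    """
    return {
        name.lower(): value
        for name, value in headers
        if name.lower() in CACHE_HEADERS
    }


def fetch_reply_headers(url):
    """GET the URL and return the caching-related reply headers."""
    with urllib.request.urlopen(url) as reply:
        return cache_relevant(reply.headers.items())
```

Calling fetch_reply_headers() on one of the image URLs from a machine that
can resolve the host should show whether the application sends
Cache-Control, Expires or Pragma headers that forbid caching.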

Um, just in case it has not already been done...
check that the QUERY ACL and its "cache deny" rule have been removed from
your squid.conf, and that "refresh_pattern -i (/cgi-bin/|\?) 0 0% 0" has
been added on the line above the dot '.' refresh pattern.
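
To see which URLs that refresh_pattern applies to, the same regular
expression can be tried in Python. Squid uses its own regex library, so
treat this as an approximation; the pattern itself is simple enough that it
should behave the same way. The helper name is hypothetical.

```python
import re

# Same expression as in the refresh_pattern line; "-i" means case-insensitive.
DYNAMIC = re.compile(r"(/cgi-bin/|\?)", re.IGNORECASE)


def is_dynamic_url(url):
    """True if the URL contains /cgi-bin/ or a query string marker '?'."""
    return DYNAMIC.search(url) is not None
```

The Image.aspx URL from this thread matches because of the '?'. With the
values "0 0% 0", matching replies that carry no explicit expiry information
are treated as stale immediately, while replies that do send explicit
freshness headers can still be cached.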

Amos