
AW: Mixing cached and non-cached access of same URLs by session-id


 



Hi Amos,

thanks for your advice so far. I am still not sure which path to follow...


>> We are using squid as a reverse-proxy cache to speed up our website.
>> A large area of the website is public. But there is also a
>> personalized area. If a user logs into his personal site, we maintain
>> a session for the user (using standard tomcat features jsession-id
>> cookie with optional url-rewriting).

>> [...] the pages in the public area have a small caveat: If the user
>> has logged in to the private area, we maintain the "logged-in" state
>> and reflect that state on public pages also (outputting "Welcome John
>> Doe" in a small box). Of course we must not cache these pages.

>> # Recognizes mysite
>> acl MYSITE url_regex ^http://[^.]*\.mysite\.de
>> 
>> # Don't cache pages, if user sends or gets a cookie
>> acl JSESSIONID1 req_header Cookie -i  jsessionid
>> cache deny MYSITE JSESSIONID1
>> 
>> acl JSESSIONID2 rep_header Set-Cookie -i jsessionid
>> cache deny MYSITE JSESSIONID2

>> This seemed to work fine, until I did a JMeter test mixing requests
>> with and without session-id cookies. It seems that if I request an
>> already cached URL with a session cookie, the cached document is
>> flushed.


>[...]

>Of course if Squid finds that it has a cached copy it will erase it, 
>because the _URL_ is not to be cached. Content is not considered.

>This is NOT the right way to do privacy caching. See below for why and 
>how to do it.

[...]

> The biggest surprise of all is still hiding unseen by you:

> Every other cache around the Internet that visitors use may be storing 
> the private area pages!!

> This is because you use a local configuration completely internal to 
> your Squid to determine what is cacheable and what is not.

> The correct way to do this is to:

>  * have the web server which generates the pages add a header 
> ("Cache-Control: private") to all pages which are in the private area of 
> the website. This tells every shared cache (your Squid included) not to 
> store the private info.

I agree with that. Do I have to configure the reverse proxy *explicitly*
to avoid caching pages marked "Cache-Control: private"?

A problem I foresee with that solution: if I set "Cache-Control: 
private" for pages containing personalized content, they will evict 
cached pages with the same URL that have no personalized content 
(remember: the page is rendered differently, depending on whether the 
user is in a session).
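
To make the header rule concrete, the decision the origin server would have to make can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tomcat application's code: the helper name `cache_headers` and the `max-age=300` lifetime are hypothetical, and only the jsessionid cookie and "Cache-Control: private" come from the discussion above.

```python
from http.cookies import SimpleCookie


def cache_headers(cookie_header: str) -> dict:
    """Pick response caching headers from the request's Cookie header.

    Hypothetical helper: a request carrying a jsessionid cookie gets the
    personalized rendering, so its response is marked private.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    # Match the cookie name case-insensitively, like the "-i" flag
    # on the Squid ACLs quoted earlier.
    has_session = any(name.lower() == "jsessionid" for name in cookies)
    if has_session:
        return {"Cache-Control": "private"}
    # Assumed lifetime for the anonymous rendering; not from the thread.
    return {"Cache-Control": "public, max-age=300"}
```

With this rule the same URL yields a private response for a logged-in user and a publicly cacheable one for everyone else. Whether the private response displaces the cached public copy is exactly the open question above; the sketch does not settle it.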

>  * have the personal adjustments to the public pages done as small 
> includes so that the main body and content of the page can be cached 
> normally, but the small modifications are not.
> For example I like including a small CSS/AJAX script which changes a 
> generic HTML div [..]

I have thought of that, too. But I would prefer not to touch 
the application.

> The HTTP way to achieve similar is to add "ETag:" header with some hash 
> of the page content in it. So each unique copy of the page is stored 
> separately. The personalized pages get "Cache-Control: private" added as 
> well so that whole request get discarded.

That sounds interesting... Are the following assumptions correct:

The ETag would be generated by the web server. A public page (/index.jsp) 
would have _one_ ETag if rendered without a session, and a different, 
unique ETag for each request (to the same /index.jsp) with a session 
cookie. The cached copy of the public page would be left untouched if the 
response bears a "Cache-Control: private" header but a different ETag. 
That implies the cache is flushed when the web server responds, not when 
the client requests.

Does the ETag have to be unique resource-wide, or is it also possible
to use the same ETag for different resources (since they have
different URLs)?

Is it another "very bad idea (tm)" to reuse the same ETag for each 
personalized page? I would assume it doesn't matter, since they are
marked "private" anyway.

> Some details indicate "Vary:" header for this, but basing it on the 
> cookie header with a session ID inside is another very bad idea that 
> will destroy your HIT rates.

> Amos


Achim

