Re: Squid4 has extremely low hit ratio due to lacks of ignore-no-cache


Hey Yuri,

What have you tried so far to understand the issue?
From your basic question I was sure that you had run some tests on some well-defined objects. To assess the state of squid you need some static objects and some changing objects. You also need to test with a simple UA such as a script, wget, or curl, and with a full-fledged UA such as Chrome, Firefox, Explorer, Opera, or Safari.
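As an illustration of the "simple UA" test, here is a minimal Python sketch that requests the same URL through the proxy several times and reports what the proxy says about each response. It assumes Squid listens on 127.0.0.1:3128 (the default port, but your setup may differ) and that the X-Cache response header has not been stripped by configuration. The function names are made up for this example.

```python
import urllib.request


def classify_x_cache(value):
    """Map an X-Cache value such as 'HIT from proxy1' to HIT, MISS or UNKNOWN."""
    words = (value or "").split()
    if not words:
        return "UNKNOWN"
    first = words[0].upper()
    return first if first in ("HIT", "MISS") else "UNKNOWN"


def fetch_via_proxy(url, proxy="http://127.0.0.1:3128"):
    """Fetch url once through the proxy and classify the X-Cache header."""
    # The proxy address is an assumption; change it to match your Squid.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    with opener.open(url, timeout=10) as resp:
        resp.read()  # drain the body so the proxy sees a complete transfer
        return classify_x_cache(resp.headers.get("X-Cache"))


def probe(url, proxy="http://127.0.0.1:3128", attempts=2):
    """Request the same URL several times; a cachable object should turn into a HIT."""
    return [fetch_via_proxy(url, proxy) for _ in range(attempts)]
```

For a static object that Squid is willing to cache, one would hope to see a MISS followed by a HIT from probe(); a list of only MISS results suggests the object is not being stored or is revalidated every time.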

I can try to give you a couple of cache-friendly sites which will show you the basic status of squid.
In the past someone here asked how to cache the site: http://djmaza.info/
which by default is built using lots of static content such as html pages, pictures and good css, and also implements cache headers nicely. I still do not know why wordpress main pages do not have valid cache headers, and what I mean by that is: is it not possible to have a page cached for 60 seconds? How many updates can occur in 60 seconds? Would a "must-revalidate" cost that much compared to what there is now? (Maybe it does.) Another one would be https://www.yahoo.com/ which is mostly fine with caching and defines its main page with "Cache-Control: no-store, no-cache, private, max-age=0" since it's a news page which updates very often.
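To make the two cases above concrete, here is a small Python sketch that parses a Cache-Control value and answers two questions a shared cache has to ask: may the response be stored at all, and for how long is it fresh. It is a simplification (it splits on commas and ignores quoted arguments that themselves contain commas), and the helper names are invented for this example.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(value):
    """A shared cache must not store responses marked no-store or private."""
    d = parse_cache_control(value)
    return "no-store" not in d and "private" not in d


def explicit_freshness(value):
    """Return the freshness lifetime in seconds (s-maxage wins over max-age), or None."""
    d = parse_cache_control(value)
    for name in ("s-maxage", "max-age"):
        arg = d.get(name)
        if arg is not None and arg.isdigit():
            return int(arg)
    return None


yahoo = "no-store, no-cache, private, max-age=0"
sixty = "max-age=60, must-revalidate"
print(shared_cache_may_store(yahoo), explicit_freshness(yahoo))
print(shared_cache_may_store(sixty), explicit_freshness(sixty))
```

The first header rules out storing in a shared cache entirely, while the second, the kind of 60 second policy asked about above, allows storing and gives the cache a short explicit lifetime.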

I was looking for cache testing subjects and I am still missing a couple of examples\options.

Until now what I have mainly used for basic tests was redbot.org and a local copy of it for testing purposes. And while writing to you and looking for test subjects I saw that my own web server has a question for squid. I have a case which is a great example of how to "break" the squid cache.
So I have my website and the main page:
http://ngtech.co.il/
https://ngtech.co.il/ (self signed certificate)

that would be cached properly by squid!
If for any reason I removed the Last-Modified header from the page it would become uncachable in squid (3.5.10) with default settings. I accidentally turned ON the apache option to treat html as php using the apache configuration line:
AddType application/x-httpd-php .html .htm

Which is recommended by the first google result for "html as php".
Once you try the next page (with this setting on):
http://ngtech.co.il/squidblocker/
https://ngtech.co.il/squidblocker/ (self signed certificate)

You will see that it is not being cached at all.
Redbot.org claims that the cache is allowed to assign its own freshness to the object, but squid (3.5.10) will not cache it whatsoever, no matter what I do. When I remove the "html as php" tweak, the page is served with a Last-Modified header and can be cached again.
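One possible reading of this behaviour, not confirmed anywhere in this thread, is that it follows from the heuristic documented for refresh_pattern in squid.conf. The sketch below is a simplified Python rendering of that documented order of checks, not Squid's actual code, and the function and parameter names are invented. With the stock rule "refresh_pattern . 0 20% 4320", a response that has neither an explicit expiry nor a Last-Modified header is stale from the moment it arrives, which would leave little reason to store it.

```python
def refresh_check(age, min_s, percent, max_s, expires_in=None, lm_age=None):
    """Return 'FRESH' or 'STALE' following the order documented for refresh_pattern.

    age        -- seconds since the response was received
    min_s      -- refresh_pattern min, in seconds (squid.conf uses minutes)
    percent    -- refresh_pattern percent as a fraction (20% -> 0.20)
    max_s      -- refresh_pattern max, in seconds (squid.conf uses minutes)
    expires_in -- seconds until an explicit expiry, or None if there is none
    lm_age     -- seconds between Last-Modified and receipt, or None if no header
    """
    if expires_in is not None:
        return "FRESH" if expires_in > 0 else "STALE"
    if age > max_s:
        return "STALE"
    if lm_age is not None and lm_age > 0:
        lm_factor = age / lm_age
        return "FRESH" if lm_factor < percent else "STALE"
    if age < min_s:
        return "FRESH"
    return "STALE"


# Default rule: refresh_pattern . 0 20% 4320 (min and max are given in minutes)
MIN_S, PERCENT, MAX_S = 0 * 60, 0.20, 4320 * 60

# Page last modified a day before it was fetched, now one hour in the cache.
print(refresh_check(3600, MIN_S, PERCENT, MAX_S, lm_age=86400))
# Same page without a Last-Modified header, just received.
print(refresh_check(0, MIN_S, PERCENT, MAX_S))
```

Under these assumptions the first call yields FRESH and the second STALE, which matches the observation that the page becomes cachable again once Last-Modified is back.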

I am unsure who the culprit is, but I will ask about it in a separate thread *if* I get no response here. (Sorry for partially top-posting.)

Eliezer


On 25/10/2015 21:29, Yuri Voinov wrote:
In a nutshell: I do not need possible explanations. I want to know: is
it a bug, or is it by design?

On 26.10.15 1:17, Eliezer Croitoru wrote:
Hey Yuri,

I am not sure whether you think that Squid version 4 with an extremely
low hit ratio is bad or not, but I can understand your view of things.
Usually I am redirecting to this page:
http://wiki.squid-cache.org/Features/StoreID/CollisionRisks#Several_real_world_examples

But this time I can proudly say that the squid project is doing things
the right way, even if some may not understand it.
Before you or anyone declares that there is a low hit ratio due to
something that is missing, I will try to put some sense into how things
look in the real world.
A small story from a nice day of mine:
I was sitting and talking with a friend of mine, an MD to be exact, and
while we were talking I was comforting him about the wonders of
computers.
He was complaining about how slowly the software in the office runs and
how he has to wait for it to respond with results. I hesitated a bit,
but then I asked him: "What would happen if some MD here in the office
received the wrong content\results on a patient from the software?"
Terrified by the question, he answered: "He could make the wrong
decision!" Then I described to him what a good place he is in when he
does not need to fear such scenarios.
In this same office Squid is used for many things, and it is crucial
that, besides the option to cache content, the ability to validate the
cache properly is set up right.

I do understand that there is a need for caches, and sometimes a cache
is crucial in order to give the application more CPU cycles or more RAM,
but sometimes the hunger for cache can override the actual requirement
for content integrity, and the content must be re-validated from time
to time.

I have seen a couple of times how a cache in a DB or at other levels
leads to a very bad and unwanted result, and I do understand some of the
complexity and caution that programmers take when building all sorts
of systems with caches in them.

If you do want to understand more about the subject, pick your favorite
scripting language and just try to implement simple object caching.
You will then see how complex the task can be, and maybe then you will
understand why caches are not such a simple thing and especially why
ignore-no-cache should be avoided in any environment whenever possible.
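A minimal sketch of that exercise in Python, assuming a fixed time-to-live and a caller-supplied fetch function (the class and parameter names are made up for illustration). Even this toy version shows the core problem: within the TTL the cache keeps answering with the old object after the origin has changed.

```python
import time


class SimpleCache:
    """A toy object cache with a fixed TTL and no revalidation."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl          # seconds an entry is considered fresh
        self.clock = clock      # injectable so the example can control time
        self._store = {}        # key -> (value, time stored)

    def get(self, key, fetch):
        """Return (value, 'HIT') from the cache or (value, 'MISS') from fetch(key)."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0], "HIT"
        value = fetch(key)
        self._store[key] = (value, now)
        return value, "MISS"


# Simulated origin and clock, so the effect is visible without waiting.
now = [0.0]
origin = {"/patient/42": "result A"}
cache = SimpleCache(ttl=60, clock=lambda: now[0])

print(cache.get("/patient/42", origin.get))   # first request goes to the origin
origin["/patient/42"] = "result B"            # the origin changes ...
now[0] = 30
print(cache.get("/patient/42", origin.get))   # ... but the cache still says A
now[0] = 61
print(cache.get("/patient/42", origin.get))   # only after the TTL is B visible
```

Everything that makes a real cache safe (validators, per-object lifetimes, honouring no-cache) is what is missing here, and ignoring the origin's directives amounts to choosing the TTL on the origin's behalf.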

While I advise you not to use it, I will give you and others a hint
about another approach to the subject.
If you are greedy and hungry for cache for specific sites\traffic and
you would like to benefit from over-caching, there is a solution for
that!
- You can alter\hack the squid code to meet your needs.
- You can write an ICAP service that alters the response headers so
squid will think the response is cachable by default.
- You can write an eCAP module that alters the response
headers ...
- You can write your own cache service with your own algorithms in it.
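To give an idea of what the header-altering options above would have to do, here is a Python sketch of only the rewriting step: it strips the directives that forbid or limit caching and substitutes a lifetime of your own choosing. It is not an ICAP service or an eCAP module, the function name and the 60 second default are assumptions made for this example, and it carries exactly the integrity risk described earlier in this mail.

```python
# Directives removed before our own max-age is added (a choice for this sketch).
DROP_DIRECTIVES = {
    "no-cache", "no-store", "private",
    "max-age", "s-maxage", "must-revalidate",
}


def force_cacheable(headers, max_age=60):
    """Rewrite a list of (name, value) response headers to look cachable."""
    out = []
    kept = []
    for name, value in headers:
        lname = name.lower()
        if lname in ("pragma", "expires"):
            continue  # drop legacy headers that could contradict the new policy
        if lname == "cache-control":
            for part in value.split(","):
                part = part.strip()
                directive = part.split("=", 1)[0].strip().lower()
                if part and directive not in DROP_DIRECTIVES:
                    kept.append(part)
            continue
        out.append((name, value))
    kept.append("max-age=%d" % max_age)
    out.append(("Cache-Control", ", ".join(kept)))
    return out


original = [
    ("Content-Type", "text/html"),
    ("Cache-Control", "no-store, no-cache, private, max-age=0"),
    ("Pragma", "no-cache"),
]
for name, value in force_cacheable(original):
    print("%s: %s" % (name, value))
```

A real adaptation service would apply this only to a carefully chosen set of URLs, since rewriting every response this way would also cache content that is genuinely private.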

Take into account that the squid project tries to be as fault tolerant
as possible because it is a very sensitive piece of software in very
big production systems.
Squid doesn't try to meet the requirement of "Maximum Cache", and it is
not squid, as a caching proxy, that reduces the cache percentage!
The content is not cachable because all these applications describe
their content as not cachable!
For a moment of sanity away from the squid project, try to contact
google\youtube admins\support\operators\forces\what-ever to understand
how you could benefit from a local cache.
If and when you do manage to contact them, let them know I was looking
for a contact and never managed to find one available to me by phone or
email. You cannot say anything like that about the squid project: the
squid project can be contacted by email and, if required, you can get
hold of the man behind the software (while he is a human).

And I will try to write it in a geeky way:
deny_info 302:https://support.google.com/youtube/ big_system_that_doesnt_want_to_be_cached

Eliezer

* P.S. If you do want to write an ICAP service or an eCAP module to
replace "ignore-no-cache", I can give you some code that might
help you as a starter.


On 25/10/2015 17:17, Yuri Voinov wrote:

Hi gents,

Has anyone testing Squid 4 noticed an extremely low cache hit ratio
with the new version? Particularly with respect to HTTPS sites that
send the "no-cache" directive? After replacing Squid 3.4 with Squid 4,
the cache hit ratio collapsed from 85 percent or more to the level of
5-15 percent. I believe this is due to the removal of support for the
ignore-no-cache directive, which eliminates the possibility of
aggressive caching and reduces the value of a caching proxy to almost
zero.

Plain HTTP is cached normally. However, due to the widespread trend
toward HTTPS, caching has decreased dramatically, to unacceptable levels.

Has anyone else noticed this effect? And what is the state of caching now?


_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users



