Re: Not all html objects are being cached

On 27.01.2017 9:10, Amos Jeffries wrote:
On 27/01/2017 9:46 a.m., Yuri Voinov wrote:

On 27.01.2017 2:44, Matus UHLAR - fantomas wrote:
On 26.01.2017 2:22, boruc wrote:
After analyzing requests and responses with Wireshark for a bit, I
noticed that many sites that weren't cached had different combinations
of the parameters below:

Cache-Control: no-cache, no-store, must-revalidate, post-check, pre-check, private, public, max-age, public
Pragma: no-cache
On 26.01.17 02:44, Yuri Voinov wrote:
If the webmaster has done this, he had good reason to. By trying to
break the RFC in this way, you break the Internet.
Actually, no. If the webmaster has done the above, he has no damn idea
what those mean (private and public?), nor how to provide properly
cacheable content.
It was sarcasm.

You may have intended it to be. But you spoke the simple truth.

Other than 'public' there really are situations which have "good reason"
to send that set of controls all at once.

For example; any admin who wants a RESTful or SaaS application to
actually work for all their potential customers.


I have been watching the below cycle take place for the past 20 years in
HTTP:

Webmaster: don't cache this, please.

   "Cache-Control: no-store"

Proxy Admin: ignore-no-store


Webmaster: I meant it. Don't deliver anything you cached without
fetching an updated version.

   ... "no-store, no-cache"

Proxy Admin: ignore-no-cache


Webmaster: Really, you MUST revalidate before using this data.

  ... "no-store, no-cache, must-revalidate"

Proxy Admin: ignore-must-revalidate


Webmaster: Really I meant it. This is non-storable PRIVATE DATA!

... "no-store, no-cache, must-revalidate, private"

Proxy Admin: ignore-private


Webmaster: Seriously. I'm changing it on EVERY request! Don't store it.

... "no-store, no-cache, must-revalidate, private, max-age=0"
"Expires: -1"

Proxy Admin: ignore-expires


Webmaster: Are you one of those dumb HTTP/1.0 proxies that don't
understand Cache-Control?

"Pragma: no-cache"
"Expires: 1 Jan 1970"

Proxy Admin: hehe! I already ignore-no-cache ignore-expires


Webmaster: F*U!  May your clients batch up their traffic to slam you
with it all at once!

... "no-store, no-cache, must-revalidate, private, max-age=0,
pre-check=1, post-check=1"


Proxy Admin: My bandwidth! I need to cache more!

Webmaster: Doh! Oh well, so I have to write my application to force new
content then.

Proxy Admin: ignore-reload


Webmaster: Now what? Oh, HTTPS won't have any damn proxies in the way...

... the cycle repeats again within HTTPS. Took all of 5 years this time.

... the cycle repeats again within SPDY. That took only ~1 year.

... the cycle repeats again within CoAP. The standards are not even
finished yet and it's already underway.


Stop this cycle of stupidity. It really HAS "broken the Internet".
All that would be just great if a webmaster was conscientious. I will give just one example.

Only one example.

root @ khorne /patch # wget -S http://www.microsoft.com
--2017-01-27 15:29:54--  http://www.microsoft.com/
Connecting to 127.0.0.1:3128... connected.
Proxy request sent, awaiting response...
  HTTP/1.1 302 Found
  Server: AkamaiGHost
  Content-Length: 0
  Location: http://www.microsoft.com/ru-kz/
  Date: Fri, 27 Jan 2017 09:29:54 GMT
  X-CCC: NL
  X-CID: 2
  X-Cache: MISS from khorne
  X-Cache-Lookup: MISS from khorne:3128
  Connection: keep-alive
Location: http://www.microsoft.com/ru-kz/ [following]
--2017-01-27 15:29:54--  http://www.microsoft.com/ru-kz/
Reusing existing connection to 127.0.0.1:3128.
Proxy request sent, awaiting response...
  HTTP/1.1 301 Moved Permanently
  Server: AkamaiGHost
  Content-Length: 0
  Location: https://www.microsoft.com/ru-kz/
  Date: Fri, 27 Jan 2017 09:29:54 GMT
Set-Cookie: akacd_OneRF=1493285394~rv=7~id=6a2316770abdbb58a85c16676a0f84fd; path=/; Expires=Thu, 27 Apr 2017 09:29:54 GMT
  X-CCC: NL
  X-CID: 2
  X-Cache: MISS from khorne
  X-Cache-Lookup: MISS from khorne:3128
  Connection: keep-alive
Location: https://www.microsoft.com/ru-kz/ [following]
--2017-01-27 15:29:54--  https://www.microsoft.com/ru-kz/
Connecting to 127.0.0.1:3128... connected.
Proxy request sent, awaiting response...
  HTTP/1.1 200 OK
  Cache-Control: no-cache, no-store
  Pragma: no-cache
  Content-Type: text/html
  Expires: -1
  Server: Microsoft-IIS/8.0
  CorrelationVector: BzssVwiBIUaXqyOh.1.1
  X-AspNet-Version: 4.0.30319
  X-Powered-By: ASP.NET
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
  Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS
  Access-Control-Allow-Credentials: true
P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
  X-Frame-Options: SAMEORIGIN
  Vary: Accept-Encoding
  Content-Encoding: gzip
  Date: Fri, 27 Jan 2017 09:29:56 GMT
  Content-Length: 13322
  Set-Cookie: MS-CV=BzssVwiBIUaXqyOh.1; domain=.microsoft.com; expires=Sat, 28-Jan-2017 09:29:56 GMT; path=/
  Set-Cookie: MS-CV=BzssVwiBIUaXqyOh.2; domain=.microsoft.com; expires=Sat, 28-Jan-2017 09:29:56 GMT; path=/
  Strict-Transport-Security: max-age=0; includeSubDomains
  X-CCC: NL
  X-CID: 2
  X-Cache: MISS from khorne
  X-Cache-Lookup: MISS from khorne:3128
  Connection: keep-alive
Length: 13322 (13K) [text/html]
Saving to: 'index.html'

index.html          100%[==================>]  13.01K --.-KB/s    in 0s

2017-01-27 15:29:57 (32.2 MB/s) - 'index.html' saved [13322/13322]

Can you explain to me why this static index.html has this:

Cache-Control: no-cache, no-store
Pragma: no-cache

?

What can break if we ignore Cache-Control for this page?
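In RFC terms the answer is that a compliant shared cache simply must not store the object. As a small self-contained sketch (plain Python, no network access; the header value is copied from the response above, and the function names are mine, not from any library) of the decision a compliant shared cache makes:

```python
# Sketch: how a compliant shared cache reads the Cache-Control
# directives seen in the response above (RFC 7234 semantics, simplified).

def parse_cache_control(value):
    """Split a Cache-Control header into a {directive: value-or-None} dict."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        directives[name.lower()] = arg or None
    return directives

def shared_cache_may_store(cache_control):
    """Return False if any directive forbids storing in a shared cache."""
    cc = parse_cache_control(cache_control or "")
    # no-store forbids storing outright; private forbids shared caches.
    # no-cache alone would still allow storing, but forces revalidation
    # on every use (Pragma: no-cache is its HTTP/1.0 equivalent).
    return "no-store" not in cc and "private" not in cc

# The www.microsoft.com response above:
print(shared_cache_may_store("no-cache, no-store"))  # -> False
print(shared_cache_may_store("public, max-age=3600"))  # -> True
```

So overriding these headers for this page is exactly the ignore-no-store case from the cycle earlier in the thread: it may work for a static page, but the server has explicitly asked for the opposite.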


Yes, saving traffic is the most important thing, because not everyone
everywhere has terabit links with unlimited data. Moreover, the number
of users keeps growing while capacity is finite. In any case, the
decision on how to handle content in such a situation should remain
with the proxy administrator, not with the developers of the proxy, who
hardcode their own vision, even one backed by the RFC. Because a 10%
byte-hit ratio (vanilla Squid; after very hard work it can reach up to
30%, but no more) is ridiculous. In such a situation it would be more
honest to cache nothing at all, only then let's not call Squid a
caching proxy. Running an additional server that demands a lot of
attention while yielding only a 10% gain is a mockery of users.

Let me explain the situation as I see it. Webmasters plaster caching
bans everywhere, in every way possible, because their pages are full of
advertising. They are paid money for that. It is the same reason Google
prevents caching of YouTube. Big money. We do not get that money; in
fact, our goal is to minimize traffic costs. We choose Squid as the
tool. And you, with your point of view, deprive us of our weapon
against unscrupulous webmasters. That is how it looks.

Again: breaking the Internet should be my choice, not yours. Either
follow the RFC 100%, or do not break it piecemeal. As the saying goes,
either put on your pants or take off the cross.


HTH
Amos
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users

