Re: Caching http google deb files

Hey Amos,

I have implemented your patch and added the following to my squid.conf:

archive_mode allow all

and my refresh pattern is:
refresh_pattern dl-ssl.google.com/.*\.(deb|zip|tar|rpm) 129600 100% 129600 ignore-reload ignore-no-store override-expire override-lastmod ignor$

But I am still not able to cache it. Can you tell from the output below what the problem is? Do I need to configure anything extra?
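In case plain "allow all" is not enough, here is how I would guess an ACL-scoped version of the directive looks, going by squid's other allow/deny directives. The `google_debs` ACL name is just my own label, and I have not verified this form against the patch:

```
# Hypothetical ACL-scoped form; 'google_debs' is an illustrative name.
# archive_mode is the directive added by the bug-4604 patch, not stock squid.
acl google_debs dstdomain dl-ssl.google.com
archive_mode allow google_debs
archive_mode deny all
```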

Here is the debug output:
------------------------------------------------------------------------------------------------

2016/10/05 15:46:25.319 kid1| 5,2| TcpAcceptor.cc(220) doAccept: New connection on FD 14
2016/10/05 15:46:25.319 kid1| 5,2| TcpAcceptor.cc(295) acceptNext: connection on local=[::]:3128 remote=[::] FD 14 flags=9
2016/10/05 15:46:25.319 kid1| 11,2| client_side.cc(2346) parseHttpRequest: HTTP Client local=192.168.1.1:3128 remote=192.168.1.76:51236 FD 12 flags=1
2016/10/05 15:46:25.319 kid1| 11,2| client_side.cc(2347) parseHttpRequest: HTTP Client REQUEST:
---------
GET http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb HTTP/1.1
Host: dl-ssl.google.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: NID=88=109tS20j8Ec0EQb5HzuNnbwtsl4sK64aakVRn-2qOe91Zv4e3st9lfyik8qQe7d12J4xBDCmdKMwiXY98a2dj4mOitaP4AbJV6fD7o9YKTxE7MziEkNCJ45GiDszPM8wXca5cuYK_gE4QVrU52VqzSa1IzmHbh_7XKsvYuDCSsgIMZaC8d4Fp01vrAU8dHPXGopVpBIxgpHwAjPv8NvLFM3e4y-um5A8umQ-GCFmpaaLd1_1jyafkNLTj-9Ix4hfsw; SID=1ANPj1-lw03bKfunZfrmk8ZsjEcTl5AiLgwzgtzki8MZ3JuvGyYgiP7LRJ05U1HQWbf76g.; HSID=AUu5M-p2Rw1uDb2_0; APISID=ss4uEw9eIOgmsZXv/ARs9Vws4Es_o_sfVX
Connection: keep-alive
Upgrade-Insecure-Requests: 1


----------
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(744) clientAccessCheckDone: The request GET http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb is ALLOWED; last ACL checked: CONNECT
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(720) clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
2016/10/05 15:46:25.320 kid1| 85,2| client_side_request.cc(744) clientAccessCheckDone: The request GET http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb is ALLOWED; last ACL checked: CONNECT
2016/10/05 15:46:25.320 kid1| 17,2| FwdState.cc(133) FwdState: Forwarding client request local=192.168.1.1:3128 remote=192.168.1.76:51236 FD 12 flags=1, url=http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb
2016/10/05 15:46:25.320 kid1| 44,2| peer_select.cc(258) peerSelectDnsPaths: Find IP destination for: http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb' via dl-ssl.google.com
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(280) peerSelectDnsPaths: Found sources for 'http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb'
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(281) peerSelectDnsPaths:   always_direct = ALLOWED
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(282) peerSelectDnsPaths:    never_direct = DENIED
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:          DIRECT = local=[::] remote=[2404:6800:4008:c02::be]:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:          DIRECT = local=0.0.0.0 remote=74.125.23.136:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:          DIRECT = local=0.0.0.0 remote=74.125.23.93:80 flags=1
2016/10/05 15:46:25.417 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:          DIRECT = local=0.0.0.0 remote=74.125.23.91:80 flags=1
2016/10/05 15:46:25.418 kid1| 44,2| peer_select.cc(286) peerSelectDnsPaths:          DIRECT = local=0.0.0.0 remote=74.125.23.190:80 flags=1
2016/10/05 15:46:25.418 kid1| 44,2| peer_select.cc(295) peerSelectDnsPaths:        timedout = 0
2016/10/05 15:46:25.418 kid1| 14,2| ipcache.cc(924) ipcacheMarkBadAddr: ipcacheMarkBadAddr: dl-ssl.google.com [2404:6800:4008:c02::be]:80
2016/10/05 15:46:25.567 kid1| 11,2| http.cc(2203) sendRequest: HTTP Server local=192.168.1.1:36674 remote=74.125.23.136:80 FD 13 flags=1
2016/10/05 15:46:25.567 kid1| 11,2| http.cc(2204) sendRequest: HTTP Server REQUEST:
---------
GET /dl/linux/direct/mod-pagespeed-beta_current_i386.deb HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: NID=88=109tS20j8Ec0EQb5HzuNnbwtsl4sK64aakVRn-2qOe91Zv4e3st9lfyik8qQe7d12J4xBDCmdKMwiXY98a2dj4mOitaP4AbJV6fD7o9YKTxE7MziEkNCJ45GiDszPM8wXca5cuYK_gE4QVrU52VqzSa1IzmHbh_7XKsvYuDCSsgIMZaC8d4Fp01vrAU8dHPXGopVpBIxgpHwAjPv8NvLFM3e4y-um5A8umQ-GCFmpaaLd1_1jyafkNLTj-9Ix4hfsw; SID=1ANPj1-lw03bKfunZfrmk8ZsjEcTl5AiLgwzgtzki8MZ3JuvGyYgiP7LRJ05U1HQWbf76g.; HSID=AUu5M-p2Rw1uDb2_0; APISID=ss4uEw9eIOgmsZXv/ARs9Vws4Es_o_sfVX
Host: dl-ssl.google.com
Cache-Control: max-age=7776000
Connection: keep-alive


----------
2016/10/05 15:46:25.780 kid1| ctx: enter level  0: 'http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb'
2016/10/05 15:46:25.780 kid1| 11,2| http.cc(717) processReplyHeader: HTTP Server local=192.168.1.1:36674 remote=74.125.23.136:80 FD 13 flags=1
2016/10/05 15:46:25.780 kid1| 11,2| http.cc(718) processReplyHeader: HTTP Server REPLY:
---------
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 6662208
Content-Type: application/x-debian-package
Etag: "fa383"
Last-Modified: Thu, 15 Sep 2016 19:24:00 GMT
Server: downloads
Vary: *
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
Date: Wed, 05 Oct 2016 10:16:25 GMT

!<arch>
debian-binary   1473872866  0     0     100644  4         `
2.0
control.tar.gz  1473872866  0     0     100644  7806      `
----------
2016/10/05 15:46:25.780 kid1| ctx: exit level  0
2016/10/05 15:46:25.780 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
2016/10/05 15:46:25.780 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
2016/10/05 15:46:25.781 kid1| 88,2| client_side_reply.cc(2005) processReplyAccessResult: The reply for GET http://dl-ssl.google.com/dl/linux/direct/mod-pagespeed-beta_current_i386.deb is ALLOWED, because it matched all
2016/10/05 15:46:25.781 kid1| 11,2| client_side.cc(1392) sendStartOfMessage: HTTP Client local=192.168.1.1:3128 remote=192.168.1.76:51236 FD 12 flags=1
2016/10/05 15:46:25.781 kid1| 11,2| client_side.cc(1393) sendStartOfMessage: HTTP Client REPLY:
---------
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 6662208
Content-Type: application/x-debian-package
ETag: "fa383"
Last-Modified: Thu, 15 Sep 2016 19:24:00 GMT
Server: downloads
Vary: *
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
Date: Wed, 05 Oct 2016 10:16:25 GMT
Connection: keep-alive


----------
2016/10/05 15:46:25.781 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
2016/10/05 15:46:25.781 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
2016/10/05 15:46:25.781 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
2016/10/05 15:46:25.782 kid1| 20,2| store.cc(949) checkCachable: StoreEntry::checkCachable: NO: not cachable
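Those repeated "checkCachable: NO: not cachable" lines line up with the `Vary: *` in the reply: per RFC 7231 section 7.1.4, a stored response whose Vary field contains "*" can never match a later request, so there is no point caching it. A minimal illustration of that rule (the function name is mine, not squid's, and this is a simplification of squid's actual checkCachable logic):

```python
def is_reusable_from_cache(response_headers: dict) -> bool:
    """Simplified RFC 7231 section 7.1.4 rule: a response whose Vary
    field contains "*" never matches a subsequent request, so a cache
    gains nothing by storing it."""
    vary = response_headers.get("Vary", "")
    members = [v.strip() for v in vary.split(",") if v.strip()]
    return "*" not in members

# Headers from the reply in the log above (abbreviated):
google_reply = {
    "Content-Type": "application/x-debian-package",
    "Etag": '"fa383"',
    "Last-Modified": "Thu, 15 Sep 2016 19:24:00 GMT",
    "Vary": "*",
}
print(is_reusable_from_cache(google_reply))  # False: the reply is uncacheable
```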



On Tue, Oct 4, 2016 at 8:00 PM, Hardik Dangar <hardikdangar+squid@xxxxxxxxx> wrote:
Wow, I hadn't thought of that. Google might need the tracking data; that could be the reason they have blindly put a Vary: * header there. Oh, the irony: the company that lectures all of us on how to deliver content efficiently is doing such a thing.

I have looked at your patch, but how do I enable it? Do I need to write a custom ACL? I know I need to recompile and reinstall after applying the patch, but what exactly do I need to put in my squid.conf? Looking at your patch, I am guessing I need to write an archive ACL, or maybe I am too naive to understand the C code :)

Also, is reply_header_replace any good for this?


On Tue, Oct 4, 2016 at 7:47 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
On 5/10/2016 2:34 a.m., Hardik Dangar wrote:
> Hey Amos,
>
> We have about 50 clients which download the same Google Chrome update
> every 2 or 3 days, which means 2.4 GB. Although the response says Vary,
> the requested file is the same, and all of it is downloaded via apt update.
>
> Is there any option just like ignore-no-store? I know I am asking for too
> much, but it seems very silly on Google's part to send a Vary header in a
> place where they shouldn't, as no matter how you access those URLs you are
> only going to get those deb files.


Some things G does only make sense when you ignore all the PR about
wanting to make the web more efficient and remember that it is a company
whose income derives from recording data about people's habits and
activities. Caching can hide that info from them.

>
> Can I hack the squid source code to ignore the Vary header?
>

Google are explicitly saying the response changes. I suspect there is
something involving Google account data being embedded in some of the
downloads. For tracking, etc.


If you want to test it, I have added a patch to
<http://bugs.squid-cache.org/show_bug.cgi?id=4604> that should implement
archival of responses where the ACLs match. It is completely untested by
me beyond building, so YMMV.

Amos



_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users
