
Re: Peering caches (squid and 3rd parties) - How to

On 6/11/2013 11:24 PM, Guillermo Javier Nardoni - Grupo GERYON wrote:
Hello everyone,

We have the following setup and have tried many configurations without success.

• 1000 customers
• 4 cache boxes running Squid 2.7 on Debian Squeeze
• The caches are fully meshed with each other
• Every Squid runs in transparent mode (http_port 3128 transparent)
• Every Squid box runs HAARPCACHE on localhost at port 8080 (HAARPCACHE is a Thundercache 3.1 fork which works perfectly for caching sites like YouTube, with lots of HITs)
• Every Squid is connected to the Internet through RB1
• RB2 (Mikrotik RouterOS) does round-robin selection across the Squids, redirecting all Internet-bound traffic on port 80 to port 3128 on the selected Squid

root@cpe-58-1-26-172:/etc/haarp# cat /etc/haarp/haarp.lst
http.*\.4shared\.com.*(\.exe|\.iso|\.torrent|\.zip|\.rar|\.pdf|\.doc|\.tar|\.mp3|\.mp4|\.avi|\.wmv)
http.*\.avast\.com.*(\.def|\.vpu|\.vpaa|\.stamp)
http.*(\.avg\.com|\.grisoft\.com|\.grisoft\.cz).*(\.bin|\.exe)
http.*(\.avgate\.com|\.avgate\.net|\.freeav\.net|\.freeav\.com).*(\.gz)
http.*\.bitgravity\.com.*(\.flv\.mp4)
http.*\.etrustdownloads\.ca\.com.*(\.tar|\.zip|\.exe|\.pkg)
http.*flashvideo\.globo\.com.*(\.mp4|\.flv)
http.{1,4}vsh\.r7\.com\/.*(\.mp4)$
74\.125\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
#http.*\.googlevideo\.com.*videoplayback
#http.*fpatch\.grandchase\.com\.br.*(\.kom|\.mkom|\.mp3)
http.*(\.kaspersky-labs\.com|\.geo\.kaspersky\.com|kasperskyusa\.com).*(\.avc|\.kdc|\.klz|\.bz2|\.dat|\.dif)
#http.*\.mccont\.com.*\.flv
http.*\.metacafe\.com.*\.flv
http.{1,4}media\w*\.justin.tv\/archives\/(\w|\/|-)*\.flv(\?.*|$)
http.{1,4}\w*juegos\w*\.juegosdiarios\.com\/(\w|\/|-)*\.swf$
http.{1,4}\w*\.juegosjuegos\.com\/games(\w|\/|-)*\.swf$
##http.*(\.windowsupdate\.com|(\.microsoft\.com)).*(\.cab|\.exe|\.iso|\.zip|\.psf)
http.*(\.windowsupdate\.com|(update|download|dlservice|windowsupdate)\.microsoft\.com)\/.*(\.cab|\.exe|\.iso|\.zip|\.psf|\.txt|\.crt)$
http.*\.pornotube\.com.*\.flv
http.*\.terra\.com.*\.flv
#http.*uol\.com\.br.*\.flv
http.*\.viddler\.com.*\.flv
#http.*\.video\.msn\.com.*\.flv
http.*(porn|img).*\.xvideos\.com\/videos\/(thumbs\/)?.*(\.jpg|\.flv\?.*|\.mp4\?.*)$
http.*\.youtube\.com.*videoplayback\?
http.*\.ziddu\.com.*(\.exe|\.iso|\.torrent|\.zip|\.rar|\.pdf|\.doc|\.tar|\.mp3|\.mp4|\.avi|\.wmv)
http.*edgecastcdn\.net/.*(\.mp4|\.flv)
http.*adobe\.com/.*(\.cab|\.aup|\.exe|\.msi|\.upd|\.msp)
http.*\.eset\.com.*\.nup
http.*\.nai\.com.*(\.zip|\.tar|\.exe|\.gem)
http.*\.pop6\.com.*(\.flv)
http.*\.symantecliveupdate\.com.*(\.zip|\.exe)
#http.*\.xpg\.com\.br.*
http.{1,4}\w*\.ytimg\.com.*(hqdefault(\.jpg|\.mp4)$|M[0-9]+\.jpg\?sigh=)
http.{1,4}\w*google(\.\w|\w)*\.doubleclick\.net\/pagead\/ads\?.*
http.*img[0-9]\.submanga\.com\/(hd)?pages\/.*(\.jpg|\.webp)
http.*(profile|s?photos|video).{0,5}\.ak\.fbcdn\.net\/.*(\.mp4\?.*|\_[a-z]\.jpg$|\.mp4$|\_[a-z]\.png$)
#http.*(profile|s?photos|video).{0,5}\.ak\.fbcdn\.net\/.*(\.mp4\?.*|\_n\.jpg$|\.mp4$|\_n\.png$)
http.*\.video\.pornhub\.\w*\.com\/videos\/.*\.flv\?.*
http.*\.(publicvideo|publicphoto)\.xtube\.com\/(videowall\/)?videos?\/.*(\.flv\?.*|\_Thumb\.flv$)
http.*public\.tube8\.com\/.*\.mp4.*
http.*videos\..*\.redtubefiles\.com\/.*\.flv
(205\.196\.|199\.91\.)[0-9]{2,3}\.[0-9]{1,3}\/.*
#http.*\.rapidshare\.com\/cgi-bin\/.*\.cgi\?.*sub=download
http.*\.vimeo.com\/.*\.mp4(\?.*)?$
http.*images\.orkut\.com\/orkut\/photos\/.*\.jpg$
http.{1,4}(\w|\/|\.|-)*media\.tumblr\.com\/(\w|\/|-|\.)*tumblr(\w|\/|-)*(\.png|\.jpg)$
#http.{1,7}speedtest(\w|-)*(\.|\w)+\/speedtest\/(random.*\.jpg|latency\.txt)\?.*
#http.{1,10}testdevelocidad.{1,5}\/speedtest\/(random.*\.jpg|latency\.txt)\?.*
#http.{1,7}(\.|[a-z]|[0-9]|-)+(\/\w+)?(\/speedtest)+\/(random[0-9]+x[0-9]+\.jpg|latency\.txt)

As you can see, YouTube and many other sites have their content cached by
HAARPCACHE and not by Squid itself. BTW, it works GREAT.
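
To see which requests the list would hand to HAARPCACHE, the patterns can be tried against sample URLs. The following is only a rough sketch: it uses Python's re module rather than the regex library Squid is built with, copies just two patterns from the list, and the videoplayback URL is a made-up example. The function name goes_to_haarp is hypothetical.

```python
import re

# Two patterns copied verbatim from the haarp.lst shown above.
PATTERNS = [
    r"http.*\.youtube\.com.*videoplayback\?",
    r"http.*\.metacafe\.com.*\.flv",
]

# re.IGNORECASE mirrors the -i flag of "acl haarp_lst url_regex -i".
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def goes_to_haarp(url):
    """Return True if any pattern matches somewhere in the URL (unanchored search)."""
    return any(rx.search(url) for rx in COMPILED)


# The first URL is a hypothetical videoplayback request; the second is the
# watch-page URL used later in this message.
VIDEO_URL = "http://r3.c.youtube.com/videoplayback?id=abc&itag=34"
WATCH_URL = "http://www.youtube.com/watch?v=juqyzgnbspY"

print(goes_to_haarp(VIDEO_URL))
print(goes_to_haarp(WATCH_URL))
```

With these two patterns, only the videoplayback request is matched; the watch page itself does not contain "videoplayback?" and would be handled by Squid directly.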

Configuration of squid.conf in /etc/squid on each box:

Proxy1:
IP: 192.168.1.1

cache_peer       192.168.2.1         sibling   3128      3130      proxy-only
cache_peer       192.168.3.1         sibling   3128      3130      proxy-only
cache_peer       192.168.4.1         sibling   3128      3130      proxy-only

acl haarp_lst url_regex -i "/etc/haarp/haarp.lst"
cache deny haarp_lst
cache_peer 127.0.0.1 parent 8080 0 proxy-only no-digest
dead_peer_timeout 2 seconds
cache_peer_access 127.0.0.1 allow haarp_lst
cache_peer_access 127.0.0.1 deny all


Proxy2:
IP: 192.168.2.1

cache_peer       192.168.1.1         sibling   3128      3130      proxy-only
cache_peer       192.168.3.1         sibling   3128      3130      proxy-only
cache_peer       192.168.4.1         sibling   3128      3130      proxy-only

acl haarp_lst url_regex -i "/etc/haarp/haarp.lst"
cache deny haarp_lst
cache_peer 127.0.0.1 parent 8080 0 proxy-only no-digest
dead_peer_timeout 2 seconds
cache_peer_access 127.0.0.1 allow haarp_lst
cache_peer_access 127.0.0.1 deny all

Proxy3:
IP: 192.168.3.1

cache_peer       192.168.2.1         sibling   3128      3130      proxy-only
cache_peer       192.168.1.1         sibling   3128      3130      proxy-only
cache_peer       192.168.4.1         sibling   3128      3130      proxy-only

acl haarp_lst url_regex -i "/etc/haarp/haarp.lst"
cache deny haarp_lst
cache_peer 127.0.0.1 parent 8080 0 proxy-only no-digest
dead_peer_timeout 2 seconds
cache_peer_access 127.0.0.1 allow haarp_lst
cache_peer_access 127.0.0.1 deny all

Proxy4:
IP: 192.168.4.1

cache_peer       192.168.2.1         sibling   3128      3130      proxy-only
cache_peer       192.168.3.1         sibling   3128      3130      proxy-only
cache_peer       192.168.1.1         sibling   3128      3130      proxy-only

acl haarp_lst url_regex -i "/etc/haarp/haarp.lst"
cache deny haarp_lst
cache_peer 127.0.0.1 parent 8080 0 proxy-only no-digest
dead_peer_timeout 2 seconds
cache_peer_access 127.0.0.1 allow haarp_lst
cache_peer_access 127.0.0.1 deny all
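
The four configurations differ only in which three addresses appear as siblings. As an illustration (not part of the original setup), a few lines of Python can generate the sibling entries for any box from the list of addresses; the function name sibling_lines is made up for this sketch, while the ports and the proxy-only option are taken from the configuration above.

```python
# Addresses of the four cache boxes, as listed above.
PROXIES = ["192.168.1.1", "192.168.2.1", "192.168.3.1", "192.168.4.1"]


def sibling_lines(own_ip, all_ips=PROXIES, http_port=3128, icp_port=3130):
    """Return one cache_peer sibling line for every box except own_ip."""
    return [
        f"cache_peer {ip} sibling {http_port} {icp_port} proxy-only"
        for ip in all_ips
        if ip != own_ip
    ]


# Print the sibling section for Proxy1.
for line in sibling_lines("192.168.1.1"):
    print(line)
```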




Everything “works” fine when you browse sites, except for the requests that
MUST go through the HAARPCACHE peer.
Let me illustrate.

Client 1 asks for http://www.youtube.com/watch?v=juqyzgnbspY . RB2, through
its round-robin selection, redirects this request to Proxy1.

Proxy1 accepts this connection and, according to “acl haarp_lst”, sends it to
cache_peer 127.0.0.1.

Proxy1 -> Peer 127.0.0.1:
• Is http://www.youtube.com/watch?v=juqyzgnbspY in the local cache?
o If YES: return a HIT with the file.
o If NO: return a MISS, download the file from the Internet, save it to disk
and serve the file.



Client 80 asks for http://www.youtube.com/watch?v=juqyzgnbspY . RB2, through
its round-robin selection, redirects this request to Proxy3.

Proxy3 accepts this connection and, according to “acl haarp_lst”, sends it to
cache_peer 127.0.0.1.

Proxy3 -> Peer 127.0.0.1:
• Is http://www.youtube.com/watch?v=juqyzgnbspY in the local cache?
o If YES: return a HIT with the file.
o If NO: return a MISS, download the file from the Internet, save it to disk
and serve the file.


As you can see, the same file is downloaded (at least) twice if the request
is not redirected to the same cache box.
How can I make each box ask every other cache first, so that a file already
cached on any sibling or parent is served from that cache instead of being
downloaded from the Internet again?
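
The effect described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions: each box is modeled as a plain set of URLs, the router is a strict round-robin, and "asking the siblings" is an in-memory lookup rather than a real ICP or HTCP exchange. All class and function names are invented for the illustration.

```python
from itertools import cycle


class CacheBox:
    """One cache box with its own private store (toy stand-in for a local cache)."""

    def __init__(self, name):
        self.name = name
        self.store = set()


def serve(box, url, siblings=()):
    """Return where the object came from: 'local', 'sibling' or 'internet'."""
    if url in box.store:
        return "local"
    if any(url in s.store for s in siblings):
        return "sibling"   # served from a sibling, not stored locally (proxy-only)
    box.store.add(url)     # MISS everywhere: download and keep a copy
    return "internet"


def count_downloads(requests, ask_siblings):
    """Count Internet downloads for a request sequence spread round-robin over 4 boxes."""
    boxes = [CacheBox(f"Proxy{i}") for i in range(1, 5)]
    round_robin = cycle(boxes)
    downloads = 0
    for url in requests:
        box = next(round_robin)
        siblings = [b for b in boxes if b is not box] if ask_siblings else []
        if serve(box, url, siblings) == "internet":
            downloads += 1
    return downloads


URL = "http://www.youtube.com/watch?v=juqyzgnbspY"
print(count_downloads([URL] * 4, ask_siblings=False))
print(count_downloads([URL] * 4, ask_siblings=True))
```

Four requests for the same URL cost four downloads when every box only checks its own store, and a single download when each box can consult the others first.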

Note 1: I can make HAARPCACHE listen on 0.0.0.0 (all interfaces) if that helps.

The schematic below shows how the clients, caches and routers are connected.
http://picpaste.com/njTPeEBb.jpg


Please forgive any errors in my English.

Nice to know about your setup.
The cache you have should support ICP or HTCP, which would let Squid find out whether the file is already in a peer's cache. I think that is not possible right now because of YouTube's dynamic links and the poor support for hierarchy protocols in these caches.
This might not be the case, but it's a good direction to look in.
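
To make the point about ICP concrete: an ICP query identifies the object only by its URL, so two requests for the same video with different dynamic URLs look like different objects to a peer. Below is a minimal sketch of an ICP version 2 query packet as laid out in RFC 2186 (20-byte header, then a 4-byte requester address and the NUL-terminated URL). The function name build_icp_query is made up, the address fields are simply left at zero, and nothing is sent on the network.

```python
import struct

ICP_OP_QUERY = 1   # opcode for a query (RFC 2186)
ICP_VERSION = 2

# Header fields: opcode, version, message length, request number,
# options, option data, sender host address.
HEADER = struct.Struct("!BBHIIII")


def build_icp_query(url, request_number):
    """Build an ICP v2 query asking a peer whether it holds `url`."""
    # Payload: 4-byte requester address (zero in this sketch) + NUL-terminated URL.
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    length = HEADER.size + len(payload)
    header = HEADER.pack(ICP_OP_QUERY, ICP_VERSION, length, request_number, 0, 0, 0)
    return header + payload


QUERY_URL = "http://www.youtube.com/watch?v=juqyzgnbspY"
packet = build_icp_query(QUERY_URL, request_number=1)
print(len(packet), packet[:4].hex())
```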

If you can try the newest Squid version from HEAD, which has StoreID in it, you might find it very powerful in your situation. There is a small "bug": when StoreID is in use, the proxy puts only the StoreID URL in the ICP requests it sends to siblings. If you ask me, that is how it should work in your setup, but in a setup with a parent proxy it should send the original request.
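
For readers unfamiliar with StoreID, here is a rough sketch of what a helper could look like. Squid sends the helper one request per line (the URL first, possibly followed by extra fields) and expects either "OK store-id=<key>" or "ERR" in reply. Everything YouTube-specific here is an assumption made for illustration: the choice of the id, itag and range parameters as the identifying fields, the internal key prefix, and the sample URLs. The sketch also assumes the helper runs without the concurrency option, so no channel ID is prefixed to each line.

```python
import io
from urllib.parse import urlsplit, parse_qs

# Hypothetical prefix for the normalized key; any stable, non-routable string would do.
KEY_PREFIX = "http://video-srv.youtube.com.SQUIDINTERNAL/"


def store_id_for(url):
    """Map a videoplayback URL to a stable key, or return None to leave the URL alone."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.endswith(".youtube.com") or parts.path != "/videoplayback":
        return None
    query = parse_qs(parts.query)
    if "id" not in query:
        return None
    # Assumption: these parameters identify the content; the rest vary per request.
    fields = [f"{k}={query[k][0]}" for k in ("id", "itag", "range") if k in query]
    return KEY_PREFIX + "&".join(fields)


def answer(line):
    """Build the reply for one request line of the form 'URL [extras...]'."""
    tokens = line.split()
    if not tokens:
        return "ERR"
    key = store_id_for(tokens[0])
    return f"OK store-id={key}" if key else "ERR"


def run(inp, out):
    """Answer every request line; flush after each reply because Squid waits for it."""
    for line in inp:
        out.write(answer(line) + "\n")
        out.flush()


# Demonstration with in-memory streams; a real helper would call run(sys.stdin, sys.stdout).
demo_in = io.StringIO(
    "http://r3.c.youtube.com/videoplayback?id=abc&itag=34&expire=1 -\n"
    "http://www.example.com/index.html -\n"
)
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

In Squid versions that include StoreID, such a helper is attached with the store_id_program directive; two differently signed URLs for the same video then map to the same cache key.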

Do you want to try this feature, which will reduce the need for an upper-layer cache proxy?

If you do, I will be happy to guide you and make sure the setup works well.

Regards,
Eliezer





