Re: How to use tcp_outgoing_address with cache_peer

On 26/04/2013 8:37 p.m., Alex Domoradov wrote:
First of all - thanks for your help.

Problem #1: Please upgrade your Squid.

It has been 3 years since the last security update for Squid-2.6, and nearly 5 years since your particular version was superseded.
ok, I will update to the latest version

On 24/04/2013 12:15 a.m., Alex Domoradov wrote:
Hello all, I encountered the problem with configuration 2 squids. I
have the following scheme -

http://i.piccy.info/i7/0ecd5cb8276b78975a791c0e5f55ae60/4-57-1543/57409208/squids_schema.jpg

Problem #2: Please read the section on how RAID0 interacts with Squid ...
http://wiki.squid-cache.org/SquidFaq/RAID

Also, since you are using SSD, see #1. Older Squid releases like 2.6 push *everything* through disk, which reduces your SSD lifetime a lot. Please upgrade to a current release (3.2 or 3.3 today); those avoid disk a lot more in general and offer cache types like rock for even better I/O savings on small responses.
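For instance, once on 3.2+, a small rock store alongside the main cache could look something like this (only a sketch; the /ssd/rock path and the 8000 MB size are illustrative):

# illustrative rock cache_dir for small objects (Squid 3.2+)
cache_dir rock /ssd/rock 8000 max-size=32768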
ok. The main reason why I chose raid0 was to get the necessary disk space, ~400 Gb.

It does not work the way you seem to think. 2x 200GB cache_dir entries have just as much space as 1x 400GB. Using two cache_dir allows Squid to balance the I/O loading on the disks while simultaneously removing all processing overheads from RAID.
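For example, something along these lines would give each SSD its own cache_dir instead of a RAID0 set (a sketch; the mount points are illustrative):

# illustrative: one cache_dir per SSD, no RAID
cache_dir aufs /cache1 200000 16 256
cache_dir aufs /cache2 200000 16 256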

acl AMAZON dstdom_regex -i (.*)s3\.amazonaws\.com
cache_peer_access 192.168.220.2 allow AMAZON

acl RACKSPACE dstdom_regex -i (.*)rackcdn\.com
cache_peer_access 192.168.220.2 allow RACKSPACE

FYI: these dstdom_regex look like they can be far more efficiently replaced
by dstdomain ACLs and even combined into one ACL name.
Did you mean something like

acl RACKSPACE dstdomain .rackcdn.com
cache_peer_access 192.168.220.2 allow RACKSPACE

Would there be any REAL speed improvement? I know that regexp is more "heavy".

Yes I meant exactly that.

And yes, it does run faster. The regex is required to do a forward-first, byte-wise pattern match, with extra complexity around whether the position it is up to is still part of the wildcard .* at the front or not. dstdomain does a reverse string comparison; in this case never more than 3 block-wise comparisons, usually only 2. It comes out a few percentage points faster even on something as simple as this change.
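For reference, the two destinations could also be folded into a single dstdomain ACL, as mentioned above (a sketch; the ACL name CDN is only illustrative):

acl CDN dstdomain .s3.amazonaws.com .rackcdn.com
cache_peer_access 192.168.220.2 allow CDN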


url_rewrite_program /usr/bin/squidguard
url_rewrite_children 32

cache_dir null /tmp
cache_store_log none
cache deny all

acl local_net src 192.168.0.0/16
http_access allow local_net

*** parent_squid squid.conf ***

http_port 192.168.220.2:3128
acl main_squid src 192.168.220.1

http_access allow main_squid
http_access allow manager localhost
http_access allow manager main_squid

icp_access allow main_squid

cache_mem 30 GB
maximum_object_size_in_memory 128 MB
cache_dir aufs /squid 400000 16 256
minimum_object_size 16384 KB
maximum_object_size 1024 MB
cache_swap_low 93
cache_swap_high 98

The numbers here look a little strange. Why the high minimum object size?
I have done some investigation and got the following information.

The number of projects for 2013
# svn export http://svn.example.net/2013/03/ ./
# find . -maxdepth 1 -type d -name "*" | wc -l
1481

The number of psd files
# find . -type f -name "*.psd" | wc -l
8680

Total amount of all psd files
# find . -type f -name '*.psd' -exec ls -l {} \; | awk '{ print $5}' | awk '{s+=$0} END {print s/1073741824" Gb"}'
116.6 Gb

The number of psd files less than 10 Mb
# find . -type f -name '*.psd' -size -10000k | wc -l
915

The number of psd files between 10-50 Mb
# find . -type f -name '*.psd' -size +10000k -size -50000k | wc -l
5799

The number of psd files between 50-100 Mb
# find . -type f -name '*.psd' -size +50000k -size -100000k | wc -l
1799

The number of psd files larger than 100 Mb
# find . -type f -name '*.psd' -size +100000k | wc -l
167

So as you can see, ~87% of all psd files are 10-100 Mb.

Overlooking the point that your minimum is 16MB and your maximum is 1024MB (a rerun with those numbers would be worth it).

What I focus on is the total size of *all* the .psd files, which is smaller than your available disk space. So limiting on size is counterproductive at this point: you are in a position to cache 100% of the .psd files and gain from re-use of even the small ones.
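In other words, something along these lines would let the full range of .psd sizes into the cache (a sketch; only the minimum is changed from your config):

minimum_object_size 0 KB
maximum_object_size 1024 MB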

Also, the .zip files you are caching matter. I expect they may invalidate my point, but you should know their parameters from an experiment like the one above as well; see the sketch below.
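A quick way to repeat that survey for the .zip files might be (a sketch following the same commands as above):

Total amount of all zip files
# find . -type f -name '*.zip' -exec ls -l {} \; | awk '{s+=$5} END {print s/1073741824" Gb"}'

The number of zip files less than 10 Mb
# find . -type f -name '*.zip' -size -10000k | wc -l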


refresh_pattern \.psd$ 2592000 100 2592000 override-lastmod override-expire ignore-reload ignore-no-cache
refresh_pattern \.zip$ 2592000 100 2592000 override-lastmod override-expire ignore-reload ignore-no-cache

All works fine until I uncomment the following line on main_squid:

tcp_outgoing_address yyy.yyy.yyy.239

When I try to download any zip file from Amazon, I see the following message in cache.log:

2013/04/22 01:00:41| TCP connection to 192.168.220.2/3128 failed

If I run tcpdump on yyy.yyy.yyy.239 I see that main_squid is trying to connect to the parent via the external interface, without success.

So my question: how can I configure main_squid so that it can connect to the parent even with the tcp_outgoing_address option configured?

#3 The failure is in TCP. Probably your firewall settings forbidding
yyy.yyy.yyy.239 from talking to 192.168.220.2.
No, the firewall is not forbidding anything. As I described above, all packets with src ip yyy.yyy.yyy.239 go through table ISP2, which looks like

# ip ro sh table ISP2
yyy.yyy.yyy.0/24 dev bond1.2000  scope link src yyy.yyy.yyy.239
default via yyy.yyy.yyy.254 dev bond1.2000

This is the routing selection. Not the firewall permissions rules.

And that is the problem. I see the following packets on my external interface:

# tcpdump -nnpi bond1.2000 port 3128
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond1.3013, link-type EN10MB (Ethernet), capture size 65535 bytes
13:19:43.808422 IP yyy.yyy.yyy.239.36541 > 192.168.220.2.3128: Flags
[S], seq 794807972, win 14600, options [mss 1460,sackOK,TS val
3376000672 ecr 0,nop,wscale 7], length 0
13:19:44.807904 IP yyy.yyy.yyy.239.36541 > 192.168.220.2.3128: Flags
[S], seq 794807972, win 14600, options [mss 1460,sackOK,TS val
3376001672 ecr 0,nop,wscale 7], length 0
13:19:46.807904 IP yyy.yyy.yyy.239.36541 > 192.168.220.2.3128: Flags
[S], seq 794807972, win 14600, options [mss 1460,sackOK,TS val
3376003672 ecr 0,nop,wscale 7], length 0

So as I understand it, the connection to my parent goes through table ISP2 (because tcp_outgoing_address sets the src ip of the packets to yyy.yyy.yyy.239) and out the external interface bond1.2000, when I expected it to be established via the internal interface bond0.

The golden question, then, is whether you see those packets arriving on the parent machine, and what happens to them there.
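As a side note, if the aim is simply to keep tcp_outgoing_address from applying to requests that go to the parent, the directive accepts ACLs, so one approach would be to exclude the peer-bound traffic using the same ACLs that select the parent (only a sketch, untested against your setup):

# illustrative: use the external address only for traffic that goes direct,
# so requests forwarded to the parent keep the default (internal) source address
tcp_outgoing_address yyy.yyy.yyy.239 !AMAZON !RACKSPACE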

Amos



