On 17/08/2013 6:45 a.m., inittab wrote:
Hello,
I wanted to get some suggestions on my current setup and ask if I'm
expecting too much out of my hardware for the traffic load.
Sorry for the slow reply.
NOTE: If you determine that it is a memory leak, please upgrade to the
current Squid-3.3 or later versions. There are a few dozen leaks in 3.1
and 3.2 series of various sizes which have been fixed. Not everybody is
hitting them due to specific behaviour causing each one, but you may be.
It appears I am running into out-of-memory problems and hitting swap;
squid processes then end up dying.
[root@squid01 squid]# dmesg | grep "page allocation"
swapper: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
kswapd0: page allocation failure. order:1, mode:0x20
squid: page allocation failure. order:1, mode:0x20
I currently have 2 Dell 2950s running Squid 3.1.10; we generally see
~200Mbps total.
The number of HTTP requests/second is the most relevant traffic speed
metric for Squid.
FYI: 200Mbps of traffic could be coming from 1 single HTTPS / CONNECT
request per day, or from a million IMS requests. The effect on and by
Squid CPU and memory is drastically different for each of those cases
and varies greatly for all permutations in between.
Each request requires some KB of buffer memory. Compare 1 request/day
with a million requests/day and you can see where the relevance starts
to appear for your particular problem.
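As a rough way to get that requests/second figure, the sketch below counts requests per wall-clock second from access.log lines. It assumes the httpd-style log format (the posted config has emulate_httpd_log on), where each line carries a bracketed timestamp such as [17/Aug/2013:06:45:01 +1200]. The function name and the regular expression are illustrative, not part of Squid.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed timestamp layout of an httpd-style (Common Log Format) line,
# e.g. [17/Aug/2013:06:45:01 +1200]
_TIMESTAMP = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"
)


def requests_per_second(lines):
    """Return (average, peak) requests per second for the given log lines.

    Lines without a recognisable timestamp are ignored.
    """
    per_second = Counter()
    for line in lines:
        match = _TIMESTAMP.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        per_second[stamp] += 1

    if not per_second:
        return 0.0, 0

    # Length of the observed window in seconds, counting both end seconds.
    window = (max(per_second) - min(per_second)).total_seconds() + 1
    average = sum(per_second.values()) / window
    peak = max(per_second.values())
    return average, peak
```

To use it, open one proxy's access.log and pass the file object to requests_per_second(); the log path depends on the installation. With 10 squid processes per box, each log would need to be measured and the results summed.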
box stats are:
2x Six-Core AMD Opteron(tm) Processor 2427 @2.2Ghz
32gb ram
1x Intel E1G44HTBLK Server Adapter I340-T4 all 4 ports bonded with 802.3ad
/var/spool/squid 512G raid5
Ah. RAID. Well, there are some more disk I/O overheads you could possibly
avoid:
http://wiki.squid-cache.org/SquidFaq/RAID
Keep in mind that the cache data is effectively a local _backup_ of data
elsewhere. It is non-critical. The only benefit you gain from RAID is
advance warning about disk failures and some time to correct them
without Squid crashing.
The boxes are both running 10 squid processes on different ports in
transparent mode
I am using iptables rules to redirect traffic to the different squid ports ex:
22M 1351M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3120
20M 1216M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3121
18M 1094M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3122
16M 985M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3123
15M 886M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3124
13M 798M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3125
12M 718M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3126
11M 647M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3127
9631K 582M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3128
8668K 524M REDIRECT tcp -- * * 10.96.0.0/15 0.0.0.0/0 statistic mode random probability 0.100000 tcp dpt:80 redir ports 3129
sysctl.conf:
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.netfilter.nf_conntrack_max = 196608
example squid config file: squid-p3120.conf
acl adminnet src 10.3.25.0/24
acl proxyvlan src 10.5.22.0/24
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access allow manager localhost
http_access allow manager adminnet
http_access allow manager proxyvlan
http_access deny manager
For high speed Squid-3.2 or later I am recommending that people at least
place the manager ACL tests down ...
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access deny to_localhost
... here, so that the faster port rejections can protect better against
some DDoS issues. ("manager" has become a regex test.)
http_access allow localhost
http_access allow customers
NP: given that the ACLs above are all allow rules, in this proxy you
could even move the manager allow lines down to here. I think manager
requests will be far rarer than your normal client traffic.
http_access deny all
hierarchy_stoplist cgi-bin ?
You can simplify the config by removing hierarchy_stoplist.
coredump_dir /var/spool/squid/p3120
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
hosts_file /etc/hosts
dns_nameservers 10.5.7.13 10.5.7.23
cache_replacement_policy heap LFUDA
cache_swap_low 90
cache_swap_high 95
maximum_object_size_in_memory 96 KB
maximum_object_size 100 MB
cache_dir aufs /var/spool/squid/p3120 204800 16 256
cache_mem 100 MB
logfile_rotate 10
memory_pools off
It does vary between installations, but memory_pools can reduce a lot of
memory allocator overhead when it is enabled.
quick_abort_min 0 KB
quick_abort_max 0 KB
log_icp_queries off
client_db off
buffered_logs on
half_closed_clients off
url_rewrite_children 20
pid_filename /var/run/squid-p3120.pid
unique_hostname squid01-p3120.eng.XXXXXX
visible_hostname squid.eng.XXXXXXX
icp_port 3100
tcp_outgoing_address 10.5.22.101
emulate_httpd_log on
Anyone have any suggestions on whether or not I'm doing something
terribly wrong here or missing some kind of performance tuning?
Your memory requirements in MB of RAM per proxy are:
100 (cache_mem) + 15*0.1 (cache_mem index) + 15*205 (cache_dir index)
+ 0.25 * R (active request buffers)
I note that this is already 3.1GB per proxy just for the index values,
so 10 proxies will leave only ~1GB of RAM for operating system use,
other processes, and Squid's active request buffering.
I suggest dropping the cache_dir size to 100000 and measuring the RAM
usage on the box to see how much you can increase it back up.
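For anyone wanting to replay that arithmetic with other sizes, here is a minimal sketch of the estimate above. It reads the formula as 15 MB of index per GB of cache_mem or cache_dir, plus 0.25 MB per active request; that reading, the function names, and the MB/1000 conversion to GB are assumptions made for illustration, so treat the results as rough planning figures only.

```python
def squid_ram_mb(cache_mem_mb, cache_dir_mb, active_requests=0,
                 index_mb_per_gb=15.0, buffer_mb_per_request=0.25):
    """Estimate RAM in MB for one Squid process.

    Mirrors: cache_mem + 15 * cache_mem(GB) + 15 * cache_dir(GB) + 0.25 * R
    """
    cache_mem_index = index_mb_per_gb * (cache_mem_mb / 1000.0)
    cache_dir_index = index_mb_per_gb * (cache_dir_mb / 1000.0)
    request_buffers = buffer_mb_per_request * active_requests
    return cache_mem_mb + cache_mem_index + cache_dir_index + request_buffers


def box_ram_gb(proxies, cache_mem_mb, cache_dir_mb, active_requests=0):
    """Estimate total RAM in GB for several identical Squid processes."""
    per_proxy = squid_ram_mb(cache_mem_mb, cache_dir_mb, active_requests)
    return proxies * per_proxy / 1000.0


if __name__ == "__main__":
    # Values from the posted config, then the suggested smaller cache_dir.
    for cache_dir_mb in (204800, 100000):
        total = box_ram_gb(10, cache_mem_mb=100, cache_dir_mb=cache_dir_mb)
        print(f"cache_dir {cache_dir_mb} MB: ~{total:.1f} GB for 10 proxies")
```

With the posted cache_dir of 204800 this comes to roughly 3.1GB per proxy before any request buffers, and with 100000 about half that, which is why the smaller size leaves headroom on a 32GB box.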
Amos