Re: Performance Extremely squid configuration advice

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Sat, 08 Jan 2011 04:29:12 +1300

On 07/01/11 19:08, Drunkard Zhang wrote:
To get a squid server handling 400M+ of traffic, I did the following:
1. Memory only
The IO bottleneck is too hard to avoid at high traffic, so I do not use
hard disks at all and use only memory for the HTTP cache. 32GB or 64GB of
memory per box works well.

NP: The problem in squid-2 is large objects in memory. Also, the more
objects you have cached, the slower the index lookups become (a very, very
minor impact).

2. Disable unneeded ACLs
I did not use any ACLs, not even the defaults:
acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl Safe_ports port 901         # SWAT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports

Squid itself does not apply any ACLs; security is ensured by other layers,
such as iptables or ACLs on routers.

Having the routers etc assemble the packets and parse the HTTP-layer 
protocol to find these details may be a larger bottleneck than testing 
for them inside Squid where the parsing has to be done a second time 
anyway to pass the request on.

Note that the default port and method ACLs in Squid validate against the
URL in the HTTP request, not the packet destination port.
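
To illustrate the distinction, here is a rough Python sketch of a port check driven by the request URL. It is not Squid's code: the helper names url_port and is_safe_port and the DEFAULT_PORTS table are assumptions made for this example, and only the port list itself comes from the Safe_ports lines quoted above.

```python
from urllib.parse import urlsplit

# Scheme defaults used when the URL names no port.
# Assumption for this sketch, not Squid's internal table.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70, "wais": 210}

# Ports from the Safe_ports ACL lines quoted above.
SAFE_SINGLE_PORTS = {80, 21, 443, 70, 210, 280, 488, 591, 777, 901}
SAFE_PORT_RANGE = range(1025, 65536)  # 1025-65535, unregistered ports


def url_port(url):
    """Return the port named in the request URL, or the scheme default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


def is_safe_port(port):
    """True if the port matches the quoted Safe_ports list."""
    return port in SAFE_SINGLE_PORTS or port in SAFE_PORT_RANGE
```

A client can reach the proxy on its own listening port while asking for http://example.com:25/. A filter that looks only at the packet destination port sees the proxy port, whereas this check sees 25 and rejects it.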

3. refresh_pattern, mainly cache for pictures
Make squid cache as long as it can, so it looks like this (one line in squid.conf):
refresh_pattern -i \.(jpg|jpeg|gif|png|swf|htm|html|bmp)(\?.*)?$ 21600 100% 21600 reload-into-ims ignore-reload ignore-no-cache ignore-auth ignore-private

4. multi-instance
I can't get a single squid process to run above 200M, so multiple
instances make perfect sense.

Congratulations, most can't get Squid to go over 50MBps per instance.

Both the CARP frontend and the backend (which stores the HTTP files) need
to be multi-instanced. The frontend configuration is here:
http://wiki.squid-cache.org/ConfigExamples/ExtremeCarpFrontend

I heard that squid still can't handle "huge" memory properly, so I
split the big memory into 6-8GB per instance, with each instance listening
on a port lower than 80. On a box with 32GB of memory, the CARP frontend
config looks like this:

cache_peer 192.168.1.73 parent 76 0 carp name=73-76 proxy-only
cache_peer 192.168.1.73 parent 77 0 carp name=73-77 proxy-only
cache_peer 192.168.1.73 parent 78 0 carp name=73-78 proxy-only
cache_peer 192.168.1.73 parent 79 0 carp name=73-79 proxy-only
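
With several backend boxes, lines like these are tedious to write by hand. A minimal Python sketch that generates them, assuming the naming scheme seen above (last octet of the address, a dash, then the port); the function name carp_peer_lines is made up for this example:

```python
def carp_peer_lines(host, ports):
    """Build one cache_peer line per backend instance listening on `host`."""
    short = host.rsplit(".", 1)[-1]  # last octet, as in name=73-76
    return [
        f"cache_peer {host} parent {port} 0 carp name={short}-{port} proxy-only"
        for port in ports
    ]


# Reproduces the four lines quoted above for the 32GB box.
for line in carp_peer_lines("192.168.1.73", range(76, 80)):
    print(line)
```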

5. CARP frontend - cache_mem 0 MB
I used to use "cache_mem 0 MB". Over time I came to think that fetching
files smaller than 1.5KB from the CARP backend is a waste. Am I right? I
use these now:

cache_mem 5 MB
maximum_object_size_in_memory 1.5 KB

The best value here differs on every network so we can't answer your 
question with details.

Log analysis of live traffic will show you the number of objects your
Squid instances are handling in each size bracket. That will determine the
best place to set this limit, trading reduced lag on small items against
your available cache_mem memory.
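
As a starting point for that analysis, here is a small Python sketch that counts replies per size bracket. It assumes the native access.log format, where the fifth whitespace-separated field is the reply size in bytes (headers included, so it slightly overstates the object size). The bracket bounds and function names are illustrative choices, not recommendations.

```python
from collections import Counter

# Upper bounds in bytes for each bracket; illustrative values only.
BRACKETS = [1024, 1536, 4096, 16384, 65536, 1048576]


def bracket_for(size):
    """Return the label of the first bracket that `size` fits into."""
    for bound in BRACKETS:
        if size <= bound:
            return f"<= {bound}"
    return f"> {BRACKETS[-1]}"


def size_histogram(lines):
    """Count log lines per size bracket, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or not fields[4].isdigit():
            continue
        counts[bracket_for(int(fields[4]))] += 1
    return counts


def report(path):
    """Print the histogram for the access.log at `path`, largest bracket first."""
    with open(path, encoding="utf-8", errors="replace") as log:
        counts = size_histogram(log)
    for label, count in counts.most_common():
        print(f"{label:>12}  {count}")
```

Calling report() with the path to a frontend's access.log gives a quick view of how much traffic falls under a candidate maximum_object_size_in_memory value.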

6. Separate LAN and WAN
Again, this splits the load across NICs. Use the LAN for clients and CARP
interaction, and the WAN to fetch content from the internet.

7. Use the official NIC driver
Sometimes the chip vendor's official driver behaves better than the
built-in driver, so it's worth a try.

8. Build on gentoo
With gentoo we can strip out as many unneeded functions as possible, making
the cache system thinner and faster.

9. Strip unneeded compile options and runtime options
Proper CFLAGS and LDFLAGS are needed; here is one good doc:
http://en.gentoo-wiki.com/wiki/Safe_Cflags

~ # squid -v
Squid Cache: Version 2.7.STABLE9
configure options:  '--prefix=/usr' '--build=x86_64-pc-linux-gnu'
'--host=x86_64-pc-linux-gnu' '--mandir=/usr/share/man'
'--infodir=/usr/share/info' '--datadir=/usr/share' '--sysconfdir=/etc'
'--localstatedir=/var/lib' '--libdir=/usr/lib64'
'--sysconfdir=/etc/squid' '--libexecdir=/usr/libexec/squid'
'--localstatedir=/var' '--datadir=/usr/share/squid' '--disable-auth'
'--disable-delay-pools' '--enable-removal-policies=lru,heap'
'--enable-ident-lookups' '--enable-useragent-log'
'--enable-cache-digests' '--enable-referer-log'
'--enable-http-violations' '--with-pthreads' '--with-large-files'
'--enable-wccpv2' '--enable-htcp' '--enable-carp' '--enable-icmp'
'--enable-follow-x-forwarded-for' '--enable-x-accelerator-vary'
'--enable-kill-parent-hack' '--enable-cachemgr-hostname=squid37'
'--enable-err-languages=English'
'--enable-default-err-language=English' '--with-maxfd=65535'
'--without-libcap' '--disable-snmp' '--disable-ssl'
'--enable-storeio=ufs,diskd,coss,aufs,null' '--enable-async-io'
'--enable-linux-netfilter' '--disable-linux-tproxy' '--enable-epoll'
'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu'
'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=barcelona -mtune=barcelona
-O2 -pipe' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed'

10. sysctl tune
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_syn_retries = 3
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_max_syn_backlog = 4096
net.core.netdev_max_backlog = 4096
net.ipv4.ip_local_port_range = 1024 65534
net.netfilter.nf_conntrack_max = 1048576
net.netfilter.nf_conntrack_tcp_timeout_established = 1000
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_max_orphans = 16384
net.ipv4.tcp_orphan_retries = 1
net.ipv4.ipfrag_high_thresh = 524288
net.ipv4.ipfrag_low_thresh = 262144
kernel.pid_max = 65535
vm.swappiness = 1
net.ipv4.tcp_mem = 6085248 8113664 12170496
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 8388608
net.core.wmem_default = 8388608
net.core.wmem_max = 8388608
net.core.somaxconn = 512
net.ipv4.udp_mem = 6194688 8259584 12389376
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1

This is all I did to get high performance. What else should I do to get
even better performance? Any more advice?

You mentioned a backend storing files. If that is using disk storage,
using COSS there (or the RockStore squid-3 branch) will speed up disk IO
by reducing the number of small operations.

I can't comment on the sysctl settings other than to say swapping is BAD 
for Squid performance.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4