On 07/01/11 19:08, Drunkard Zhang wrote:
In order to get a squid server to handle 400M+ traffic, I did these:

1. Memory only
The IO bottleneck is too hard to avoid at high traffic, so I do not use hard disks; only memory is used for the HTTP cache. 32GB or 64GB of memory per box works well.
NP: The problem in squid-2 is large objects in memory. Also, the more objects you have cached, the slower the index lookups become (a very, very minor impact).
2. Disable useless acl
I did not use any ACLs, not even the default ones:

    acl SSL_ports port 443
    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443         # https
    acl Safe_ports port 70          # gopher
    acl Safe_ports port 210         # wais
    acl Safe_ports port 1025-65535  # unregistered ports
    acl Safe_ports port 280         # http-mgmt
    acl Safe_ports port 488         # gss-http
    acl Safe_ports port 591         # filemaker
    acl Safe_ports port 777         # multiling http
    acl Safe_ports port 901         # SWAT
    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports

Squid itself does not apply any ACLs; security is ensured by other layers, like iptables or ACLs on routers.
Having the routers etc assemble the packets and parse the HTTP-layer protocol to find these details may be a larger bottleneck than testing for them inside Squid where the parsing has to be done a second time anyway to pass the request on.
Note that the default port and method ACLs in Squid validate the URLs in the HTTP header content, not the packet destination port.
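To illustrate the distinction, here is a minimal Python sketch (not Squid code) of a port check made against the request URL rather than the TCP destination port. The port list is copied from the Safe_ports lines quoted above; the names `url_port` and `is_safe_port` and the default-port table are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Ports copied from the Safe_ports lines quoted above.
SAFE_PORTS = {80, 21, 443, 70, 210, 280, 488, 591, 777, 901}
SAFE_RANGE = range(1025, 65536)  # "unregistered ports" 1025-65535

# Assumed default ports for URLs that do not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70, "wais": 210}


def url_port(url):
    """Return the port named (or implied) by the URL, or None if unknown."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


def is_safe_port(url):
    """Rough equivalent of 'http_access deny !Safe_ports' for one URL."""
    port = url_port(url)
    return port is not None and (port in SAFE_PORTS or port in SAFE_RANGE)
```

A client can connect to the proxy's own listening port and still ask for `http://example.com:25/`. A packet filter that only sees the connection to the proxy never sees the 25, whereas `is_safe_port` rejects it.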
3. refresh_pattern, mainly cache for pictures
Make squid cache objects for as long as it can, so it looks like this:

    refresh_pattern -i \.(jpg|jpeg|gif|png|swf|htm|html|bmp)(\?.*)?$ 21600 100% 21600 reload-into-ims ignore-reload ignore-no-cache ignore-auth ignore-private

4. multi-instance
I can't get a single squid process to run over 200M, so multi-instance makes perfect sense.
Congratulations, most can't get Squid to go over 50MBps per instance.
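As an aside on the refresh_pattern regex in item 3, a quick way to see which URLs it catches is to try it outside Squid. The sketch below uses Python's `re` module, which is a different regex engine from the one Squid uses, on the assumption that this particular pattern uses only constructs both engines treat the same way. The helper name `matches` and the sample URLs are made up for the example.

```python
import re

# The pattern from the quoted refresh_pattern line; -i maps to IGNORECASE.
PATTERN = re.compile(r"\.(jpg|jpeg|gif|png|swf|htm|html|bmp)(\?.*)?$", re.IGNORECASE)


def matches(url):
    """True if the URL would be selected by the quoted refresh_pattern regex."""
    return PATTERN.search(url) is not None
```

Note that a URL such as `http://example.com/x?file=a.jpg` also matches, because the pattern only requires the extension to sit at the end of the string or just before a `?`.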
Both the CARP frontend and the backend (which stores the HTTP files) need to be multi-instanced. Frontend configuration is here:
http://wiki.squid-cache.org/ConfigExamples/ExtremeCarpFrontend

I heard that squid still can't handle "huge" memory properly, so I split the big memory into 6-8GB per instance, each listening on a port lower than 80. On a box with 32GB memory the CARP frontend config looks like this:

    cache_peer 192.168.1.73 parent 76 0 carp name=73-76 proxy-only
    cache_peer 192.168.1.73 parent 77 0 carp name=73-77 proxy-only
    cache_peer 192.168.1.73 parent 78 0 carp name=73-78 proxy-only
    cache_peer 192.168.1.73 parent 79 0 carp name=73-79 proxy-only

5. CARP frontend - cache_mem 0 MB
I used to use "cache_mem 0 MB". Over time I came to think that fetching files smaller than 1.5KB from the CARP backend is a waste, am I right? I use these now:

    cache_mem 5 MB
    maximum_object_size_in_memory 1.5 KB
The best value here differs on every network so we can't answer your question with details.
Log analysis of live traffic will show you the number of objects your Squid is handling in each size bracket. That will determine the best place to set this limit to reduce the lag on small items versus your available cache_mem memory.
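A minimal sketch of that kind of log analysis in Python, assuming the native Squid access.log format, where the fifth whitespace-separated field is the number of bytes delivered to the client. The bracket boundaries and the names `bracket_label` and `size_histogram` are arbitrary choices for illustration, not recommendations.

```python
from collections import Counter

# Upper bounds (in bytes) of hypothetical size brackets; 1536 is 1.5 KB.
BRACKETS = [1024, 1536, 4096, 16384, 65536, 1048576]


def bracket_label(size):
    """Return the label of the first bracket that can hold 'size' bytes."""
    for limit in BRACKETS:
        if size <= limit:
            return f"<= {limit}"
    return f"> {BRACKETS[-1]}"


def size_histogram(lines):
    """Count log entries per size bracket, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Native format: time elapsed client code/status bytes method URL ...
        if len(fields) < 5 or not fields[4].isdigit():
            continue
        counts[bracket_label(int(fields[4]))] += 1
    return counts
```

Passing an open access.log file object to `size_histogram` gives the per-bracket counts. The logged size includes reply headers, so it slightly overstates the object size that maximum_object_size_in_memory is compared against; treat the counts as a rough guide.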
6. LAN, WAN separation
Again, to split the load across NICs. Use the LAN for clients and CARP interaction, and the WAN to fetch content from the internet.

7. Using official NIC driver
Sometimes the chip vendor's official driver behaves better than the built-in driver, so it's worth a try.

8. Based on gentoo
Using gentoo, we can strip as many useless functions as possible, making the cache system thinner and faster.

9. Strip useless compile options and runtime options
Proper CFLAGS and LDFLAGS are needed, here's one good doc:
http://en.gentoo-wiki.com/wiki/Safe_Cflags

    ~ # squid -v
    Squid Cache: Version 2.7.STABLE9
    configure options: '--prefix=/usr' '--build=x86_64-pc-linux-gnu' '--host=x86_64-pc-linux-gnu' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib' '--libdir=/usr/lib64' '--sysconfdir=/etc/squid' '--libexecdir=/usr/libexec/squid' '--localstatedir=/var' '--datadir=/usr/share/squid' '--disable-auth' '--disable-delay-pools' '--enable-removal-policies=lru,heap' '--enable-ident-lookups' '--enable-useragent-log' '--enable-cache-digests' '--enable-referer-log' '--enable-http-violations' '--with-pthreads' '--with-large-files' '--enable-wccpv2' '--enable-htcp' '--enable-carp' '--enable-icmp' '--enable-follow-x-forwarded-for' '--enable-x-accelerator-vary' '--enable-kill-parent-hack' '--enable-cachemgr-hostname=squid37' '--enable-err-languages=English' '--enable-default-err-language=English' '--with-maxfd=65535' '--without-libcap' '--disable-snmp' '--disable-ssl' '--enable-storeio=ufs,diskd,coss,aufs,null' '--enable-async-io' '--enable-linux-netfilter' '--disable-linux-tproxy' '--enable-epoll' 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' 'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=barcelona -mtune=barcelona -O2 -pipe' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed'

10. sysctl tune

    net.ipv4.ip_forward = 0
    net.ipv4.conf.default.rp_filter = 1
    net.ipv4.conf.default.accept_source_route = 0
    kernel.sysrq = 0
    kernel.core_uses_pid = 1
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_syn_retries = 3
    net.ipv4.tcp_synack_retries = 3
    net.ipv4.tcp_max_syn_backlog = 4096
    net.core.netdev_max_backlog = 4096
    net.ipv4.ip_local_port_range = 1024 65534
    net.netfilter.nf_conntrack_max = 1048576
    net.netfilter.nf_conntrack_tcp_timeout_established = 1000
    net.ipv4.tcp_timestamps = 0
    net.ipv4.tcp_sack = 0
    net.ipv4.tcp_low_latency = 1
    net.ipv4.tcp_fin_timeout = 15
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 3
    net.ipv4.tcp_keepalive_time = 1800
    net.ipv4.tcp_max_orphans = 16384
    net.ipv4.tcp_orphan_retries = 1
    net.ipv4.ipfrag_high_thresh = 524288
    net.ipv4.ipfrag_low_thresh = 262144
    kernel.pid_max = 65535
    vm.swappiness = 1
    net.ipv4.tcp_mem = 6085248 8113664 12170496
    net.ipv4.tcp_wmem = 4096 65536 8388608
    net.ipv4.tcp_rmem = 4096 87380 8388608
    net.core.rmem_default = 8388608
    net.core.rmem_max = 8388608
    net.core.wmem_default = 8388608
    net.core.wmem_max = 8388608
    net.core.somaxconn = 512
    net.ipv4.udp_mem = 6194688 8259584 12389376
    net.ipv4.udp_rmem_min = 8192
    net.ipv4.udp_wmem_min = 8192
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_tw_recycle = 1

This is all I did to get high performance. What should I do to get even better performance? Any more advice?
You mentioned a backend storing files. If that is using disk storage the use of COSS there (or the RockStore squid-3 branch) will speed up the disk IO by reducing the number of small ops.
I can't comment on the sysctl settings other than to say swapping is BAD for Squid performance.
Amos
-- 
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4