Re: Squid on DualxQuad Core 8GB Rams - Optimization - Performance - Large Scale - IP Spoofing

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Mon, 15 Oct 2007 02:13:59 +1300

Haytham KHOUJA (devnull) wrote:
Hello,
The purpose of this thread is to join forces to have the best Squid 
configuration for generic affordable Intel machines available by major 
vendors (Dell/HP...) specifically for ISPs and corporations that want a 
basic setup but with optimal response and throughput and maximizing 
bandwidth savings.
I work for an important ISP and I currently replaced 2 NetApp NetCache 
with 3 Dell 2950 hooked up on a Foundry Switch for Load Balancing.
I used tproxy to spoof the outgoing IP address, with some configuration 
on the Cisco core router. I had to compile iptables and tproxy against 
a Debian kernel source (2.6.18).

I've read almost every single thread on Optimizing Squid and Linux and 
want to share my setup with you.
I do have some questions, clarifications and bugs but overall the 
performance is pretty impressive. (Yes, much better than the NetApps)

Since I have 8 GB of RAM, I want to keep more hot objects in memory to 
maximize the memory hit ratio, but with my setup Squid doesn't go above 
2-3 GB of usage. (Remember that there are no other heavy processes on 
the machine.)

You will need a 64-bit enabled squid to go higher than 2GB.
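
One way to check whether an installed binary is a 32-bit or 64-bit build is to 
read the class byte of its ELF header. This is a minimal sketch; the function 
name elf_class is made up for illustration, and it assumes a Linux ELF 
executable.

```sh
#!/bin/sh
# Print 32 or 64 for an ELF executable, or "unknown" for anything else.
# Bytes 0-3 of an ELF file are the magic 7f 'E' 'L' 'F'; byte 4 (EI_CLASS)
# is 1 for a 32-bit object and 2 for a 64-bit object.
elf_class() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo unknown; return; }
    class=$(od -An -tu1 -j4 -N1 "$1" 2>/dev/null | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}
```

With the --sbindir used in this build, the call would be something like 
"elf_class /usr/sbin/squid".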

If I had known beforehand that Squid doesn't make use of SMP, I wouldn't 
have bought Dual Quad Core and would have invested in Intel CPUs with 
8 MB of cache, but what's done is done :)

Before, I had Squid go down because the file descriptor limit (maximum 
open files) was reached and ip_conntrack filled up; I fixed both with 
some iptables and sysctl configuration.
Now I'm hitting this error:
  Oct 14 01:17:06 proxy4 squid[8883]: assertion failed: diskd/store_io_diskd.c:384: "!diskdstate->flags.close_request"
so Squid is killed and restarts (which flushes the memory cache).

I'm looking forward to contributions, idea sharing and corrections, to 
make this a standard setup for large-scale, well-optimized, 
high-performance Squid deployments and a base for future tweaking. I hope 
this configuration can then be uploaded to the Squid wiki.

Post your squid.conf to
  http://squid.treenet.co.nz/cf.check/
and review the results. I've pointed out the biggest worries below.

Here's my setup:
Dell 2950
Dual Quad Core 2.4Ghz / 8 GB Rams / 4x 136 GB 15000 RPM drives

I have 3 cache_dirs on separate drives and I formatted the 3 disks with 
ReiserFS:
   /dev/sdb1       /CACHE1 reiserfs notail,noatime         0 0
   /dev/sdc1       /CACHE2 reiserfs notail,noatime         0 0
   /dev/sdd1       /CACHE3 reiserfs notail,noatime         0 0

I run Debian GNU/Linux Etch and compiled Squid with the following:
Squid Cache: Version 2.6.STABLE16
configure options:  '--bindir=/usr/bin' '--sbindir=/usr/sbin/' 
'--sysconfdir=/etc' '--enable-icmp' '--enable-snmp' '--enable-async-io' 
'--enable-linux-netfilter' '--enable-linux-tproxy' '--with-dl' 
'--with-large-files' '--enable-large-cache-files' '--with-maxfd=1000000' 
'--enable-storeio=diskd,ufs' '--with-aio' '--enable-epoll' 
'--disable-ident-lookups' '--enable-removal-policies=heap' 
'CFLAGS=-DNUMTHREADS=120'

As you can see, I have the following modules enabled: linux-tproxy, 
diskd, epoll, and removal policies.
epoll improves network I/O performance, diskd moves disk I/O into 
separate processes (which reduces the time Squid blocks on disk writes), 
and I read benchmarks for the memory and disk removal policies.

aufs does a better job, particularly where threads are available, and is 
not quite so broken as diskd.
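
As a rough sketch of what switching store types could look like, the filter 
below rewrites "cache_dir diskd ..." lines to aufs and drops the trailing 
Q1/Q2 options, which are diskd-specific. It assumes the binary was built with 
aufs listed in --enable-storeio (the build shown in this thread lists only 
diskd,ufs); the function name diskd_to_aufs is hypothetical.

```sh
#!/bin/sh
# Read squid.conf on stdin and write it to stdout, turning every
#   cache_dir diskd <path> <MB> <L1> <L2> [diskd options...]
# line into
#   cache_dir aufs <path> <MB> <L1> <L2>
# All other lines pass through unchanged.
diskd_to_aufs() {
    sed -E 's/^cache_dir[[:space:]]+diskd[[:space:]]+([^[:space:]]+)[[:space:]]+([0-9]+)[[:space:]]+([0-9]+)[[:space:]]+([0-9]+).*$/cache_dir aufs \1 \2 \3 \4/'
}
```

For example, "diskd_to_aufs < /etc/squid.conf > /tmp/squid.conf.aufs" leaves 
the original file untouched so the result can be reviewed before use.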

My /etc/squid.conf is composed of the following:

http_port 80 transparent tproxy
tcp_outgoing_address IP of the Machine
:: Those are for IP spoofing and transparency

via off
forwarded_for off
:: Those are for total transparency, remote hosts will never guess that 
the request came from a proxy

IIRC, there's more than this needed for complete silence. These options 
just replace the Via and X-Forwarded-For values with the text 'unknown', 
still leaving the headers in place for anon-proxy identification.
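
To see what actually reaches a remote host, one option is to capture a request 
on an origin server you control and filter it for the two headers named here. 
The sketch below only does the filtering; how the request is captured (packet 
capture, a test server that logs raw requests) is left open, and the function 
name is hypothetical.

```sh
#!/bin/sh
# Read raw HTTP request headers on stdin and print any line that starts
# with Via: or X-Forwarded-For: (case-insensitive). Prints nothing when
# neither header is present.
proxy_revealing_headers() {
    grep -iE '^(Via|X-Forwarded-For):' || true
}
```

An empty result for these two headers does not by itself prove the proxy is 
undetectable; it only confirms that these particular headers are gone.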

cache_mem 600 MB
:: A bit confused about this. When I go higher than 2 GB, Squid dies 
with an "out of memory" error. I have 8 GB and want to maximize the use 
of it.

cache_effective_user nobody
cache_effective_group nogroup
:: Security and bla bla

This is the default UID. If this is going to be a standard config these 
MUST NOT be explicitly set.
Also, configuring the GID as above will in fact cause a squid-specific 
deviation from the configured OS-level security policy.

They are no longer to be used unless the machine-specific setup requires 
it AND the admin knows how to set them up properly.

cache_replacement_policy heap LFUDA
memory_replacement_policy heap GDSF
:: Very objective, you can google about them

cache_dir diskd /CACHE1 61440 16 256 Q1=144 Q2=128
cache_dir diskd /CACHE2 61440 16 256 Q1=144 Q2=128
cache_dir diskd /CACHE3 61440 16 256 Q1=144 Q2=128
:: DISKD configuration, i'm only using 60GB of each disk
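
A rough worked estimate may help relate these cache_dir sizes to the 2-3 GB 
of memory use reported earlier. It assumes the commonly quoted rule of thumb 
of about 10 MB of index memory per GB of cache_dir, which is an assumption 
and not something stated in this thread, and adds cache_mem on top. The 
function name is hypothetical.

```sh
#!/bin/sh
# Usage: estimate_squid_mem_mb CACHE_MEM_MB DIR_MB [DIR_MB ...]
# Prints an approximate memory footprint in MB: cache_mem plus an assumed
# 10 MB of in-memory index per 1024 MB of configured cache_dir space.
estimate_squid_mem_mb() {
    cache_mem=$1
    shift
    total_dir=0
    for dir_mb in "$@"; do
        total_dir=$((total_dir + dir_mb))
    done
    index_mb=$((total_dir * 10 / 1024))
    echo $((cache_mem + index_mb))
}

# estimate_squid_mem_mb 600 61440 61440 61440    # prints 2400
```

With cache_mem at 600 MB and three 61440 MB cache_dirs, the estimate is about 
2400 MB, which is in the range where a 32-bit process runs out of address 
space.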

cache_access_log /var/log/squid/access.log

Obsolete option. Use access_log with same parameters instead.

cache_log /var/log/squid/cache.log
cache_store_log none
:: No need to log cache_store, so minimizing the Disk I/O

fqdncache_size 51200
ipcache_size 51200
:: Caching IPs/Domain Name and whatnot

pipeline_prefetch on
:: Performance enhancement

shutdown_lifetime 1 second
:: Tired of waiting whenever I restart my Squids (only for testing)

read_ahead_gap 60 KB
maximum_object_size 2 GB
minimum_object_size 0 KB
maximum_object_size_in_memory 128 KB
cache_swap_high 80%
cache_swap_low 70%
half_closed_clients off
memory_pools on
positive_dns_ttl 24 hours
negative_dns_ttl 30 seconds
request_timeout 60 seconds
connect_timeout 30 seconds
pconn_timeout 30 seconds
ie_refresh on
dns_nameservers DNS1 DNS2
emulate_httpd_log off
log_ip_on_direct on
debug_options ALL, 9

Performance enhancements above to minimize disk I/O, yet you log 
everything at full debug? This ALL,9 could cause extremely high disk 
usage under load. Try ALL,1 (minimal) or ALL,5 (detailed overview) instead.

pid_filename /var/run/squid.pid

My IPtables/sysctl and startup file:
#!/bin/sh
iptables -t tproxy -A PREROUTING -i eth0 -p tcp -m tcp --dport 80 -j TPROXY --on-port 80
:: I run the Squids on port 80 so that I can forward all incoming port 80 
requests to them at the Cisco router level

echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind
echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range
echo 102400  > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 1000000 > /proc/sys/net/ipv4/ip_conntrack_max
echo 1000000 > /proc/sys/fs/file-max
echo 60 > /proc/sys/kernel/msgmni
echo 32768 > /proc/sys/kernel/msgmax
echo 65536 > /proc/sys/kernel/msgmnb
:: Maximizing Kernel configuration
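
The plain echo lines above keep going silently if one of the /proc entries is 
missing on a given kernel (the location of the conntrack limit, for instance, 
differs between kernel versions). A small wrapper can make that visible. This 
is only a sketch, and set_proc is a made-up helper name.

```sh
#!/bin/sh
# Usage: set_proc FILE VALUE
# Write VALUE to FILE if FILE exists and is writable; otherwise print a
# warning on stderr and return non-zero so the caller can react.
set_proc() {
    if [ -w "$1" ]; then
        echo "$2" > "$1"
    else
        echo "warning: cannot write $1, skipped" >&2
        return 1
    fi
}
```

A call then looks like 'set_proc /proc/sys/net/ipv4/ip_local_port_range 
"1024 65535"', with multi-value settings quoted as one argument.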

ulimit -HSn 1000000
/etc/init.d/squid stop
/etc/init.d/squid start
:: Re-enforcing ulimit parameters for the Squid process.
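
To confirm that the raised limit actually reached the Squid process, one 
option is to look at what Squid itself reports at startup. The sketch below 
assumes cache.log contains a startup line of the form "With N file 
descriptors available"; the function name is hypothetical.

```sh
#!/bin/sh
# Read cache.log on stdin and print the descriptor count from the most
# recent "With N file descriptors available" line, if there is one.
fd_limit_from_log() {
    sed -n 's/.*With \([0-9][0-9]*\) file descriptors available.*/\1/p' | tail -n 1
}
```

With the cache_log path from the configuration above, that would be 
"fd_limit_from_log < /var/log/squid/cache.log".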

Thank you

No, thank you.

Amos