Tony Dodd wrote:
Chris Robertson wrote:
First of all, thanks for sharing the write-up. There are a number of
high-load squid installations (Wikipedia and Flickr are two of the
largest I know of), but not much information on what tweaks to make in
the interest of performance.
No problem. =] I encountered the same problem when trying to figure out
how to get more performance so I figured once I'd cracked it, the least
I could do was document it for the other people having the same issue
(and to give myself a reference for later).
After perusing your posting, I'm wondering if you would run a
"squidclient -p 80 mgr:info |grep method". I'm making the assumption
that your squid is listening on port 80, so please change the argument
to -p if needed. Your configuration options included "--enable-poll",
but with a 2.6 kernel and 2.6 sources, I would be surprised if you are
not actually using epoll. It might be a superfluous compile option.
[root@cache1 ~]# squidclient -p 8081 mgr:info |grep method
IO loop method: poll
Hmm, as Adrian said, try adding --enable-epoll to your options. In
theory that should give a similar improvement over poll as aufs gives
over ufs.
Also, since you are building from source, try the absolute latest 2.6
around. Adrian has optimisation work underway that is showing some
noticeable speed improvements across the 2.6-teens.
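For anyone who wants to script the check above, here is a minimal Python sketch that pulls the "IO loop method" line out of mgr:info output. The function names are made up for illustration, the default port of 8081 simply matches the command shown earlier, and it assumes squidclient is on the PATH.

```python
import re
import subprocess


def io_loop_method(info_text):
    """Return the value of the 'IO loop method' line, or None if absent."""
    match = re.search(r"^\s*IO loop method:\s*(\S+)", info_text, re.MULTILINE)
    return match.group(1) if match else None


def fetch_mgr_info(port=8081):
    """Run 'squidclient -p PORT mgr:info' and return its output as text.

    Assumption: squidclient is installed and the cache manager is
    reachable on the given port without a password.
    """
    result = subprocess.run(
        ["squidclient", "-p", str(port), "mgr:info"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Typical use on the cache box (not executed here):
#   method = io_loop_method(fetch_mgr_info(8081))
#   if method != "epoll":
#       print("squid is using", method, "- consider rebuilding with --enable-epoll")
```

Keeping the parsing separate from the squidclient call means the parser can be checked against saved output without a running squid.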
Cache digests are not the only method of sharing between peers. ICP
is an alternative and I have read that multicast works well for
scaling beyond a handful of peers. I can't seem to find the posting
now that I want to reference it. I'd trust your experience over my
memory of someone else's posting, but I thought I would raise the
possibility.
I was under the impression that when utilizing cache peering, it worked
better if the squids had a digest of what was on X squid server, before
asking for it. I could be wrong on that though - Adrian, care to
comment on this one? It's now redundant in my situation though, as
every peering mechanism is slower than going back to parent in our use
case.
Theoretically yes. Practically ... there are incremental and cyclic
digest methods. The former is not much better than multicast ICP. The
latter suffers from periodic minor update delays. But none have been
adequately benchmarked in squid AFAIK.
... Side project anyone?
I'm surprised you had to specify your hosts file in your squid.conf.
/etc/hosts is the default.
There are a couple of bugs in squid that seem to cause issues if you
don't actually specify the hosts file within the squid conf... worst
case, it's an extra line of config to parse on startup.
Are these bugs in bugzilla? Please add asap if not.
Lastly, I'd be wary of specifying dns_nameservers as a squid.conf
option. Squid will use the servers specified in /etc/resolv.conf if
this option is not specified. Now you have to maintain name servers
in two locations.
Same goes here; DNS lookups were taking 200-1000ms without specifying
dns_nameservers in the config (the nameservers specified there are the
same ones within /etc/resolv.conf); now they're sub-1ms. There isn't
much chance of us re-ip-ing internally, so it's a pretty safe config
option for us. I definitely agree that it could cause problems for
people using public DNS resolution though.
Hmm, glad it works for you.
I think it might have something to do with other settings in
resolv.conf, namely 'search' and 'domain' which can result in NXDOMAIN
results leading to several lookups. The default may also include a
legacy host name lookup, where dns_nameservers might cause a bypass
(although I don't have time to check the code and confirm that).
Worth a note, though. This is going into my todo pile for a later check.
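To illustrate the 'search' theory, here is a rough Python sketch of how a typical stub resolver expands one name into several queries when a search list is set. It models the usual resolv.conf 'search' and 'ndots' rules only. Whether squid's own DNS code follows the same order is exactly the part I have not confirmed, and the domain names are made up.

```python
def candidate_queries(name, search_domains, ndots=1):
    """List the names a typical stub resolver would try, in order.

    Assumption: standard resolv.conf semantics. A name with fewer than
    'ndots' dots is tried against the search list first and as-is last;
    otherwise it is tried as-is first. A trailing dot disables the search.
    """
    if name.endswith("."):
        return [name]
    absolute = name + "."
    suffixed = [f"{name}.{domain}." for domain in search_domains]
    if name.count(".") >= ndots:
        return [absolute] + suffixed
    return suffixed + [absolute]


# Hypothetical search list, as it might appear in /etc/resolv.conf
search = ["dc1.example.internal", "example.internal"]

for query in candidate_queries("cache1", search):
    print(query)
```

Each candidate that comes back NXDOMAIN costs a full round trip before the next one is tried, so a long search list can multiply the lookup time for names that only resolve in their final form.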
Amos