On 4/08/2013 5:22 p.m., Tyler Sweet wrote:
Hello,
My second message to the mailing list :)
I've run into some problems when it comes to having two squid boxes
configured to be siblings to each other. I wasn't able to pull much
data about what happened, but I can sum it up for you here and then
try to replicate it back when I get access to my home lab again.
We're handling about 100-200 requests a second, mainly medium to small
files, with the occasional 2+GB game update or so. What we saw
happening, even under medium to low load (less than 50 users, probably
closer to 10-20 requests a second) was that when both squid servers
were set up with each other as a cache peer, one or more squid
processes would start to eat memory. Eventually, they would either eat
enough by themselves (22GB) or 4-7 together (each with 4 or more GB of
memory in RSS) to cause the server to run into out of memory
conditions and kill squid.
Which *exact* release versions have you observed this behaviour in?
Originally, I thought this was caused by my self-compiled version of
Squid 3.4 on FreeBSD, and since I was short on time and could not
look into it further, I reinstalled the servers with CentOS 6.4 and used
the repo listed on the squid site to install squid 3.3.8. The problem
persisted, and without any time to troubleshoot I simply disabled the
cache-peer configurations.
I'm pretty sure I've messed up the configuration somehow. Here are
what I think are the relevant config settings I've been using:
# Squid Boxen #################
acl siblings src 172.16.1.91
acl siblings src 172.16.1.90 # Local server
# Cache Peers
htcp_port 4827
htcp_access allow siblings
htcp_clr_access allow siblings
htcp_access deny all
htcp_clr_access deny all
# Sibling
cache_peer 172.16.1.91 sibling 3128 4827 htcp
cache_peer_access 172.16.1.91 deny STEAM_CONTENT
cache_peer_access 172.16.1.91 allow all
Now, looking at the config I feel like I should probably have set the
"siblings" acl separately on both servers, to deny HTCP access from
looping around.
Yes, each config should define only the IP of the other sibling.
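A minimal sketch of that split, assuming the two boxes are 172.16.1.90
and 172.16.1.91 as in the config above. The cache_peer line for the
second box is an assumption, mirrored from the one that was posted:

```
# On 172.16.1.90: only the other box counts as a sibling
acl siblings src 172.16.1.91
htcp_access allow siblings
htcp_access deny all
cache_peer 172.16.1.91 sibling 3128 4827 htcp

# On 172.16.1.91: the mirror image (assumed, not taken from the post)
acl siblings src 172.16.1.90
htcp_access allow siblings
htcp_access deny all
cache_peer 172.16.1.90 sibling 3128 4827 htcp
```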
NP: IIRC there is a bug still in the CLR handling causing Squid to loop
CLR requests between the peers indefinitely. That should not eat up so
much memory, but might eat bandwidth.
But I don't know if that looping would have had this
effect or not, nor do I remember seeing anything in the logs about
looping happening. Can anyone offer some guidance on this? Is it
simply that I messed up the initial configuration?
The above part looks fine by itself.
The main thing in Squid that controls forwarding loops is "via on",
which is the default. I assume you have not disabled that.
As a backup, you can add a cache_peer_access deny line that stops Squid
from sending the peer any requests that came from it in the first place
(cache_peer_access 172.16.1.91 deny siblings).
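Put together with the lines already in the config, that could look
roughly like the sketch below on the 172.16.1.90 box. It assumes the
"siblings" acl there now lists only 172.16.1.91, and that STEAM_CONTENT
is defined elsewhere in the config as before. cache_peer_access rules
are checked in order and the first match wins, so the deny has to sit
above the "allow all" line:

```
# Default, shown only to make the loop detection explicit
via on

cache_peer 172.16.1.91 sibling 3128 4827 htcp
cache_peer_access 172.16.1.91 deny STEAM_CONTENT
# Never send the sibling a request that arrived from the sibling
cache_peer_access 172.16.1.91 deny siblings
cache_peer_access 172.16.1.91 allow all
```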
Amos