-----Original Message-----
From: crobertson@xxxxxxx [mailto:crobertson@xxxxxxx]
Sent: Thursday, April 30, 2009 2:13 PM
To: squid-users@xxxxxxxxxxxxxxx
Subject: Re: Parent Proxy's, Not Failing over when Primary Parent is Down.

Dean Weimer wrote:
> I have a current parent/child proxy configuration I have been testing; it's working with the exception of some sites not failing over to the second parent when the primary parent goes down.
>
> In the test scenario I have 2 parent proxies and one child proxy server; the parents are each configured twice using an alias IP address. This is done to load balance using round robin for the majority of web traffic, yet send some sites that we have identified as not working correctly with load balancing out through a single parent proxy.

Since Squid 2.6 there has been a parent selection method called "sourcehash", which will keep a client-to-parent-proxy relationship until the parent fails.

I considered this, but was concerned that after a failed proxy server the majority of my load would be on one server, not taking advantage of both links once the problem is resolved.

> The load balanced traffic works as expected; the dead parent is identified and ignored until it comes back online. The traffic that cannot be load balanced is all using HTTPS (not sure whether HTTPS has anything to do with the problem or not). When I stop the parent proxy 10.50.20.7 (aka 10.52.20.7), the round-robin configuration is promptly marked as dead. However, a website in the NONBAL acl that has already been connected to just returns the proxy error from the child: connect to (10.52.20.7) parent failed, connection denied.

Hmmm... You might have to disable server_persistent_connections, or lower the value of persistent_request_timeout, to have a better response rate to a parent failure with your current setup.

I also considered this, but figured it would break some sites that are working successfully with load balancing because they create a persistent connection, and making the request timeout too low would become annoying to the users. Also, as the default is listed at 2 minutes, I noticed that even after as much as 5 minutes the connection would not fail over.

> It will not mark the non-load-balanced parent dead, and closing and restarting the browser doesn't help. It will change the status to dead, however, if I connect to another site in the NONBAL acl. Going back to the first site, I can then connect, even though I have to log in again, which is expected and why these sites cannot be load balanced.
>
> Does anyone have any ideas, short of writing some sort of test script, for causing the parent to be marked as dead when it fails, without any user intervention?
>
> Here is the cache peer configuration from the child proxy. FYI, I added the 5 sec timeout to see if it had any effect; it didn't, with the exception of speeding up the detection of the dead load balanced proxy.
> ## Define Parent Caches
> # Cache Peer Timeout
> peer_connect_timeout 5 seconds
> # Round Robin Caches
> cache_peer 10.50.20.7 parent 8080 8181 name=DSL2BAL round-robin
> cache_peer 10.50.20.6 parent 8080 8181 name=DSL1BAL round-robin
> # Non Load Balanced caches
> cache_peer 10.52.20.7 parent 8080 8181 name=DSL2
> cache_peer 10.52.20.6 parent 8080 8181 name=DSL1
>
> ## Define Parent Cache Access rules
> # Access Control Lists
> acl NONBAL dstdomain "/usr/local/squid/etc/nonbal.dns.list"
> # Rules for the Control Lists
> cache_peer_access DSL2BAL allow !NONBAL
> cache_peer_access DSL1BAL allow !NONBAL
> cache_peer_access DSL2 allow NONBAL
> cache_peer_access DSL1 allow NONBAL
>
> Thanks,
> Dean Weimer
> Network Administrator
> Orscheln Management Co

Chris

I am currently doing some testing by creating access control lists for a couple of nonexistent subdomains on our own domain. Requests for these just return the parent proxy's error page for a nonexistent domain, so the testing shouldn't put any unnecessary load on the internet links. Each test domain is then allowed out through one of the non-load-balanced parents. Accessing that page with my browser causes the parent to be marked dead.

I could look at writing a script that accesses these pages through the child proxy every so many seconds to cause the parent to be marked as dead (a rough sketch of that idea is below). It's kind of a hacked-together solution, but hopefully it would keep the users from having too much downtime in the event that one proxy goes down.

It would probably be preferable, though, to query ICP directly and then do a reconfigure on the child Squid to exclude that parent from its configuration. If anyone can tell me where to find information on how to do an ICP query, that would save me some time and be greatly appreciated. In the meantime I will start searching, or, worse yet, if that fails, sniff the network traffic and write an application to mimic the Squid query.
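A minimal sketch of that polling idea (untested, and assuming Python on the box doing the monitoring; the child proxy address and the test URLs are placeholders rather than our real config):

#!/usr/bin/env python3
"""Poll the test URLs through the child proxy so a dead parent gets noticed
without waiting for a real user to hit it.  Untested sketch; the proxy
address and test URLs below are placeholders, not the real setup."""

import time
import urllib.error
import urllib.request

CHILD_PROXY = "http://10.50.20.1:8080"   # placeholder: child proxy address
TEST_URLS = [
    "http://test-dsl2.example.com/",     # placeholder: NONBAL test host routed via DSL2
    "http://test-dsl1.example.com/",     # placeholder: NONBAL test host routed via DSL1
]
INTERVAL = 30                            # seconds between polling rounds

# Force every request through the child proxy, the same way a browser would.
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": CHILD_PROXY})
)

while True:
    for url in TEST_URLS:
        try:
            with opener.open(url, timeout=10) as resp:
                status = resp.getcode()
        except urllib.error.HTTPError as err:
            # Error pages (the parent's "nonexistent domain" page, or the
            # child's "connect to parent failed" page) come back as HTTP errors;
            # either way the request was attempted, which is all that matters here.
            status = err.code
        except urllib.error.URLError as err:
            status = "FAILED: %s" % err.reason
        print(time.strftime("%H:%M:%S"), url, "->", status)
    time.sleep(INTERVAL)

Run as a daemon or from cron, the periodic requests should get a dead parent noticed within roughly one polling interval instead of waiting for a user to stumble into it.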
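For the ICP side, the message format is documented in RFC 2186, so mimicking the query shouldn't require sniffing traffic. A rough, untested sketch based on that layout follows; 8181 is the ICP port per the cache_peer lines above, and the URL in the query is only a placeholder, since the point is just whether any reply comes back before the timeout:

#!/usr/bin/env python3
"""Minimal ICP_OP_QUERY probe, field layout per RFC 2186 (untested sketch)."""

import socket
import struct
import sys

ICP_OP_QUERY = 1
ICP_VERSION = 2

def icp_probe(host, port=8181, url="http://www.example.com/", timeout=5.0):
    """Send one ICP query to host:port; return the reply opcode, or None on timeout."""
    # Payload of a query: 4-byte requester host address (zero) + NUL-terminated URL.
    payload = struct.pack("!I", 0) + url.encode("ascii") + b"\x00"
    length = 20 + len(payload)          # 20-byte fixed header + payload
    header = struct.pack(
        "!BBHIIII",
        ICP_OP_QUERY,   # opcode
        ICP_VERSION,    # version
        length,         # total message length in octets
        1,              # request number (echoed back in the reply)
        0,              # options
        0,              # option data
        0,              # sender host address (unused)
    )
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(header + payload, (host, port))
        reply, _ = sock.recvfrom(4096)
        return reply[0]                 # first octet of the reply is its opcode
    except socket.timeout:
        return None
    finally:
        sock.close()

if __name__ == "__main__":
    for parent in sys.argv[1:] or ["10.50.20.7", "10.50.20.6"]:
        opcode = icp_probe(parent)
        if opcode is None:
            print(parent, "NO REPLY (treat as down)")
        else:
            print(parent, "alive, reply opcode", opcode)

Any reply opcode (HIT, MISS, or presumably DENIED if icp_access blocks the query) would show the parent is still answering ICP; silence until the timeout would be the trigger to run the reconfigure that drops that parent.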