Hey everyone.

We have been running Apache 2.2 as a reverse proxy cache server for some time, and in general it performs great. We have one nagging problem, however.

We currently have a cluster of 70+ back-end web servers configured as BalancerMembers. If we run apachectl stop on a back-end web server, we are fine. If, however, a back end drops off the network (server crash, ARP issues, whatever), the front-end cache hangs until it comes back. Requests destined for all of the other back-end web servers then hang as well, and we get a snowball effect that requires a total restart of both the cache and the back-end web servers to clear things up. Obviously this is not ideal.

This all looks like a TCP timeout issue of some kind. I was perplexed to discover that running

    netstat -anp|grep [backend that died ip]

showed only 8 connections in the SYN_SENT state. I also notice that the Apache balancer-manager scoreboard marks the web server as ERR (appropriately), yet we are still hanging on something. We have eliminated the cluster fs as the culprit, as network issues on a server that is not a back-end web server do not cause a problem.

We would be perfectly happy if a web server that does not respond for 5 seconds were marked as in error and the cache did not try to connect to it again for quite some time. So far we have specified each balancer member to have:

    retry=120 max=40

We do this because we have some web pages that can legitimately take up to 60 seconds to finish rendering.

Is there some kind of timeout we can configure at the OS or Apache level that would prevent waiting on a host that has gone completely dark for 5 seconds?

Thanks in advance for any advice people have.

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
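To make the behaviour asked for above concrete, here is a minimal Python sketch of a connect-level timeout. It only illustrates the idea of giving up on a dark host after 5 seconds instead of sitting in SYN_SENT until the kernel stops retransmitting. It is not an Apache feature or configuration, and the function name backend_is_reachable and its default values are hypothetical.

```python
import socket


def backend_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) completes within `timeout` seconds.

    Hypothetical helper for illustration only; not part of Apache or mod_proxy.
    """
    try:
        # create_connection applies the timeout to the connect step itself,
        # so a host that silently drops SYNs fails after `timeout` seconds
        # rather than after the kernel's SYN retransmissions run out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False
```

A call such as backend_is_reachable("192.0.2.10", 80) (a placeholder documentation address) would return False after at most about 5 seconds if the host is dark, which is the kind of bound the question is looking for on the proxy side.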