Re: "proxy_balancer" | stickysession

On 27.10.2010 16:10, King Holger (CI/AFP2) wrote:
we need your help :) We have re-discovered a Tomcat connection-switch problem in combination with Apache2 2.2.16 and "mod_proxy_balancer". This time, we saved the Apache2 error log!

The Apache2 access log documents the connection switch from one Tomcat, "rb-wcmstc1", to the other one, "rb-wcmstc2":
10.35.32.123 - - [25/Oct/2010:11:50:04 +0200] "POST /servlet/ClientIO/1ojgw1th835ue/s6/y1 HTTP/1.1" 500 1258 "-" "Jakarta Commons-HttpClient/3.1" "JSESSIONID=A62BCC2B1054CA532C68CA409F41548C.rb-wcmstc1" "A62BCC2B1054CA532C68CA409F41548C.rb-wcmstc1" "JSESSIONID=56DB518EDE4EFED6337349F3478A4863.rb-wcmstc2; Path=/" 2795

Apache2-ERROR-Log Output:
[Mon Oct 25 10:18:58 2010] [error] server is within MinSpareThreads of MaxClients, consider raising the MaxClients setting
[Mon Oct 25 11:46:35 2010] [error] server reached MaxClients setting, consider raising the MaxClients setting
[Mon Oct 25 11:50:02 2010] [error] (70007)The timeout specified has expired: ajp_ilink_receive() can't receive header

OK, here the last line is relevant: the failover happens because Apache does not get an answer from rb-wcmstc1 before the timeout expires.

Since we can see that Apache had created all of its allowed processes shortly before (it hit MaxClients), it is likely that all threads in Tomcat are busy as well.

Our assumption: a bottleneck within the Apache2 connection pool (http://www.gossamer-threads.com/lists/apache/users/343190)?

When monitoring Apache2 via the "/server-status" resource, the following information is shown (see the enclosed screenshot). As you can see:
- the total number of requests that can be handled in parallel (the sum of busy and idle workers): 400
- the number of idle workers sometimes reaches the critical value of 0 (not shown in the enclosed screenshot)

After checking the central Apache2 configuration file, we found that it does NOT include the MPM-specific configuration file ("httpd-mpm.conf"):
# Server-pool management (MPM specific)
#Include conf/extra/httpd-mpm.conf

Currently, we use the "worker" MPM (and NOT prefork), as the "apachectl" output shows:
<hostname>:/opt/wcms/apache/bin $ ./apachectl -M | grep -i "worker"
Syntax OK
  mpm_worker_module (static)

Our questions:
- when "httpd-mpm.conf" (containing configurations for the different MPMs) is not included, which values are used?

A maximum of 16 processes with 25 threads each, i.e. 400 threads in total. Those are the compiled-in defaults of the worker MPM.
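For reference, here is roughly what those compiled-in worker defaults would look like if written out explicitly (a sketch only; the directive names are the Apache 2.2 worker-MPM directives, and the exact default values should be double-checked against the documentation of your build):

  <IfModule mpm_worker_module>
      ServerLimit          16
      ThreadsPerChild      25
      MaxClients          400
      MinSpareThreads      75
      MaxSpareThreads     250
  </IfModule>

MaxClients is effectively ServerLimit * ThreadsPerChild = 16 * 25 = 400, which matches the 400 you see in server-status, and the "within MinSpareThreads of MaxClients" warning in your error log refers to exactly these values.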

- why 400 requests? We assumed a default value of 150 for "MaxClients" (see "httpd-mpm.conf")

Unfortunately, the values in httpd-mpm.conf do not reflect the defaults compiled into the code.

- should we increase the "MaxClients" value by defining it explicitly in "httpd-mpm.conf" (as suggested by the "error.log" above)?

You could, but make sure that you remove the comment character ('#') in front of the Include statement, otherwise the file is still not read.
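A minimal sketch of both steps (the path and the chosen numbers are only assumptions to illustrate the idea; size them to your own load):

In httpd.conf:
  # Server-pool management (MPM specific)
  Include conf/extra/httpd-mpm.conf

In conf/extra/httpd-mpm.conf:
  <IfModule mpm_worker_module>
      ServerLimit          32
      ThreadsPerChild      25
      MaxClients          800
      MinSpareThreads      75
      MaxSpareThreads     250
  </IfModule>

Note that with the worker MPM you also have to raise ServerLimit, because MaxClients cannot exceed ServerLimit * ThreadsPerChild.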

When monitoring one of the Tomcats under "/manager/status" (here: "rb-wcmstc2"), the enclosed screenshot shows the following information:
- MaxThreads: 500
- ThreadCount: 252
- Current Thread Busy: 239

The configuration for the Tomcats (here "rb-wcmstc2"):
<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" allowTrace="true" maxThreads="500" />

Consequence:
- it seems clear that Apache2 is the bottleneck (the number of connections available within the connection pool sporadically drops to 0)

You could increase the allowed concurrency for Apache. But often the problem is not the limited concurrency, but that some application is too slow. As a rule of thumb:

Needed concurrency = Throughput * Response Time

So if you want to handle 100 requests per second and assume an average response time of 0.5 seconds, the needed concurrency will typically be around 50. As always, you should add some headroom to the numbers you calculate.

Typically, if you run into a performance problem, the response times explode while the number of incoming requests stays high. That means the needed concurrency also explodes. In that case it does not make sense to increase the available concurrency by, e.g., a factor of 10. Instead you need to find out why things are getting slow.

So: monitor your Apache usage (server-status). If you are typically in the range between 300 and 400 and only sometimes hit 400, increasing the available concurrency might be OK (remember that Tomcat also needs enough threads). If your typical concurrency is around 10 or 50 and only sometimes it hits 400, then increasing it will not help. Instead you will need to find out what happens when the numbers get too big.

First check in your access log, using "%D", that it is really the response times that grow and not so much the load (requests per second). Do that check for both Apache and Tomcat and compare the numbers.
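If "%D" is not yet part of your log format, here is a sketch of how to add it on both sides (log file names and format names are just placeholders; note that Apache's %D logs microseconds, while the Tomcat AccessLogValve's %D of that era logs milliseconds):

Apache (httpd.conf):
  LogFormat "%h %l %u %t \"%r\" %>s %b %D" combined_time
  CustomLog logs/access_log combined_time

Tomcat (server.xml, inside the Host element):
  <Valve className="org.apache.catalina.valves.AccessLogValve"
         directory="logs" prefix="access_log." suffix=".txt"
         pattern="%h %l %u %t &quot;%r&quot; %s %b %D" />

Then correlate the per-request times on both sides to see whether the time is spent inside Tomcat or already in front of it.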

Then, once you know that growing response times are the problem, the best approach is to use Java thread dumps for Tomcat to analyze why the app gets slow.
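For example, a sketch of collecting such dumps (the process id and tool availability depend on your JDK installation):

  jps -l                        # find the Tomcat (Bootstrap) process id
  jstack <pid> > threaddump.1   # take a dump; repeat a few times, some seconds apart
  # alternative without jstack: kill -QUIT <pid>   (the dump ends up in catalina.out)

Take several dumps while the slowdown is actually happening and look for threads that stay busy in the same stack frames across dumps.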

Regards,

Rainer

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
  "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx


