Hello,
Thanks for the tips. I implemented them, but unfortunately they did not help.

In our stress test the cluster was brought to its knees, but we still cannot
figure out where the bottleneck is.

Things we checked using perfmon:

- CPU %: average never above 70 on any of the servers
- CPU queue length: average never above 14
- Pages Input/sec: nominal
- Avg. Disk Read/Write Queue: average never above 0.5
- Network usage: never spikes above 25%

I believe we may be maxing out our bandwidth at 10mb. We are upping it to 30mb
shortly.

The server recovers fine afterwards now, though, which is good news.

This is one worker and the lb:

worker.tomcat6.port=16009
worker.tomcat6.host=192.168.150.12
worker.tomcat6.type=ajp13
worker.tomcat6.reply_timeout=30000
worker.tomcat6.lbfactor=100

worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=tomcat2,tomcat1,tomcat4,tomcat5,tomcat6

This is the status page after a fresh restart and 4 minutes of heavy load.

Worker Status for loadbalancer

Type  Sticky Sessions  Force Sticky Sessions  Retries  LB Method  Locking     Recover Wait Time  Max Reply Timeouts
lb    True             False                  2        Request    Optimistic  60                 0

Good  Degraded  Bad/Stopped  Busy  Max Busy  Next Maintenance
5     0         0            224   248       55/117

Balancer Members

Name     Type   Host                  Addr                  Act  State  D  F    M  V    Acc   Err  CE  RE  Wr    Rd    Busy  Max  Route    RR  Cd  Rs
tomcat2  ajp13  localhost:12009       127.0.0.1:12009       ACT  OK     0  50   2  602  3665  0    4   0   1.1M  52M   30    37   tomcat2          0/0
tomcat1  ajp13  localhost:11009       127.0.0.1:11009       ACT  OK     0  50   2  600  3594  0    8   0   1.1M  52M   30    42   tomcat1          0/0
tomcat4  ajp13  192.168.150.12:14009  192.168.150.12:14009  ACT  OK     0  100  1  602  7303  0    15  0   2.3M  103M  58    73   tomcat4          0/0
tomcat5  ajp13  192.168.150.13:15009  192.168.150.13:15009  ACT  OK     0  100  1  601  7173  0    18  0   2.2M  103M  54    73   tomcat5          0/0
tomcat6  ajp13  192.168.150.12:16009  192.168.150.12:16009  ACT  OK     0  100  1  601  7261  0    18  0   2.3M  101M  51    71   tomcat6          0/0
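
As a rough cross-check of the figures on the status page, here is a minimal Python sketch. It assumes the F column is each worker's lbfactor and Acc is the number of requests that worker has handled; the dictionary layout and function name are only for illustration. It compares the share of requests each worker should get from its lbfactor with the share it actually got, and adds up the per-worker Busy values.

```python
# Figures copied from the status page: name -> (F, Acc, Busy, Max).
# Assumption: F is the lbfactor and Acc the number of requests handled.
workers = {
    "tomcat2": (50, 3665, 30, 37),
    "tomcat1": (50, 3594, 30, 42),
    "tomcat4": (100, 7303, 58, 73),
    "tomcat5": (100, 7173, 54, 73),
    "tomcat6": (100, 7261, 51, 71),
}

total_factor = sum(f for f, _, _, _ in workers.values())
total_acc = sum(acc for _, acc, _, _ in workers.values())
total_busy = sum(busy for _, _, busy, _ in workers.values())


def shares(table):
    """Return name -> (expected share from lbfactor, observed share of Acc)."""
    return {
        name: (f / total_factor, acc / total_acc)
        for name, (f, acc, _, _) in table.items()
    }


for name, (expected, observed) in shares(workers).items():
    print(f"{name}: expected {expected:.1%}, observed {observed:.1%}")
print(f"busy connections summed over workers: {total_busy}")
```

If the observed shares track the expected ones closely, the balancer is distributing requests as configured, and the thing to watch is the Busy and Max columns rather than the distribution itself.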
Any ideas?
Thanks in advance,
--James

----- Original Message -----
Sent: Monday, September 17, 2007 11:01 AM
Subject: Re: Load balancing question

I think you should use timeouts! It seems that your requests take a long time
to be processed by your Tomcats. If you reach the maximum number of connections
(HTTP or AJP), you have to wait for a Tomcat response to free a connection slot.
What does your jk_status page say? Are all your workers in an error state? How
many busy connections do you have?

You can:

- in httpd.conf:
  + if you are using keepalive, add a keepalive timeout. 5, 10 or 15 s may be
    enough.
  + if you are using mpm_winnt, increase the ThreadsPerChild value to increase
    the maximum number of available connections.
- in workers.properties:
  + worker.yourworker.reply_timeout=30000. After 30 s without a response, the
    request will be tried on another worker or fail.
  + limit your connection_pool_size if you are in multi-threaded httpd mode.
    You may have one connection per thread, which can overload your Tomcats.
- in your Tomcat: increase your AJP connector's maxThreads (200 by default).
  It is not very efficient to have too many threads, but it can prevent
  refused connections.
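
The sizing concern behind the last two points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the numbers are hypothetical, and it assumes each httpd process may open up to connection_pool_size AJP connections to one Tomcat, which then needs at least that many connector threads.

```python
def ajp_headroom(pool_size, httpd_processes, max_threads):
    """Return Tomcat's maxThreads minus the worst-case AJP connections.

    Assumption: every httpd process can open up to `pool_size` connections
    to the same Tomcat. A negative result means httpd could open more
    connections than the connector has threads to serve.
    """
    worst_case = pool_size * httpd_processes
    return max_threads - worst_case


# Hypothetical values: a pool of 250 connections in a single httpd process
# against a connector left at maxThreads=200.
print(ajp_headroom(pool_size=250, httpd_processes=1, max_threads=200))
```

A negative headroom suggests either lowering connection_pool_size or raising maxThreads; which one is appropriate depends on how many threads the Tomcat host can actually sustain.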
--
Bj
http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html

On 9/17/07, James Sherwood <jsherwood@xxxxxxxxxxxxxxxx> wrote:

Hello,

CORRECTED (status page working now)

We upgraded to the latest mod_jk and these were the results:

1: All monitors were fine; there were no bottlenecks anywhere that we could
find (CPUs, HDs and networks all seemed fine).
2: This time when we brought the servers to their knees, they recovered a short
time after the test was completed.
3: We tried socket_keepalive=true for the workers and the server did not
recover afterwards.
4: The only problem we can find is that after the test, the mod_jk log has
about 20-30 lines of this:

[Mon Sep 17 08:03:49.906 2007] [7948:4868] [error] jk_ajp_common.c (2097): (tomcat5) Connecting to tomcat failed. Tomcat is probably not started or is listening on the wrong port

The lines vary only by the (tomcat5) being any of the Tomcats in the load
balancer.

It seems like apache/tomcat/mod_jk are reaching the max number of connections
between each other or something?

Any help would be GREATLY appreciated,
--James

----- Original Message -----
Sent: Monday, September 17, 2007 9:12 AM
Subject: Re: Load balancing question

Hello,

I cannot get my mod_jk status page to work. Maybe it is because I am on
Windows? It seems that

worker.list=jk-manage
worker.jk-manage.type=status
worker.jk-manage.mount=/admin/status/jk

only takes a Linux-style path for the mount?

--James

----- Original Message -----
Sent: Saturday, September 15, 2007 5:17 AM
Subject: Re: Load balancing question

What does your mod_jk status page say? Try to monitor it during the load to see
whether your workers are in error or OK state, whether the max busy is reached,
and so on. Then look at your logs (mod_jk, Apache, Tomcat, webapp logs,
Windows, ...).

As said before, you should check the number of open TCP connections. If you do
not use the keepalive feature, you can have a bottleneck there (Apache and
Tomcat servers). You can also get errors such as the maximum number of open
files being reached. Then look at the load average, system CPU, iowait, ...
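
One way to check the number of open TCP connections is to summarise the output of `netstat -an` by connection state. The Python sketch below is a minimal example of that; the function name is made up for illustration, and the sample text (including the 192.168.150.10 address for the Apache host and the local port numbers) is hypothetical, not captured from the servers in this thread.

```python
from collections import Counter


def count_tcp_states(netstat_output, port=None):
    """Count TCP connection states in text printed by `netstat -an`.

    If `port` is given, only lines whose local or foreign address ends
    with that port are counted.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and non-TCP entries.
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue
        if port is not None and not any(
            f.endswith(f":{port}") for f in fields[1:-1]
        ):
            continue
        counts[fields[-1]] += 1  # the state is the last column
    return counts


# Hypothetical sample in the layout printed by Windows `netstat -an`.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    192.168.150.10:3051    192.168.150.12:14009   ESTABLISHED
  TCP    192.168.150.10:3052    192.168.150.12:14009   TIME_WAIT
  TCP    192.168.150.10:3053    192.168.150.13:15009   ESTABLISHED
"""

print(count_tcp_states(SAMPLE))
print(count_tcp_states(SAMPLE, port=14009))
```

Running it on output captured before, during and after the load test would show whether connections to a given AJP port pile up in one state instead of being released.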

You can also monitor your Tomcats through JMX (using jconsole or Mission
Control) to check that garbage collection works fine and does not hang for too
long.

Try deactivating the 2 Tomcat instances on your Apache server to see whether
httpd is still available after the load test.
--
Bj

On 9/14/07, James Sherwood <jsherwood@xxxxxxxxxxxxxxxx> wrote:

Hello,

Everything is Windows 2003 Server.

After the load we cannot load pages either through Apache or by contacting
Tomcat directly.

I believe you are on the right path, though, about connections not getting
released. That's what I figure it is too, but I do not know how to fix it.

Thanks,
James

----- Original Message -----
From: "AFrieze" <AFrieze@xxxxxxxxxxxx>
To: <users@xxxxxxxxxxxxxxxx>
Sent: Friday, September 14, 2007 12:02 PM
Subject: Re: Load balancing question

>> We also have the problem of once the load stops, the sites are still down
>> but Apache/tomcats still seem to be running fine. A restart of
>> either (not even both) fixes the sites.
>
> A guess
>
> Your apache server is not releasing connections. If you are running
> linux, type "netstat -vat" into a terminal on your apache machine, before
> and after you hit your server. See if the connections are being released.
>
> You could also try typing "ps -e | grep "httpd"" to see how many apache
> processes are being run before/after. Look in the apache error log, etc.
> You might find a clue like "MaxClients reached"
>
> Question
> Are you able to log into all your tomcats (through port 8080) independent
> of apache and get served requests? Can you log onto apache and get a
> statically served page?
>
> Cheers
> AFrieze
>
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP Server Project.
> See <URL: http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
>    "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx