Re: Uneven load distribution in Tomcat application servers proxy balanced in front end Apache httpd web server

Gaurav,

On 12/22/15 11:26 AM, Gaurav Kumar wrote:
> I am using 6 Apache httpd 2.2.15 servers, which forward requests to
> Tomcat application servers (version 7.0.41). Using mod_proxy, all the
> application servers are balanced behind proxy balancers. Below is the
> relevant configuration from httpd.conf:
> 
> ##Proxy Balancers for use by all Virtual Hosts
> <Proxy balancer://FrontEnd>
>     BalancerMember ajp://APP01.abcd.com:8009 route=APP01 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP02.abcd.com:8009 route=APP02 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP03.abcd.com:8009 route=APP03 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP04.abcd.com:8009 route=APP04 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP05.abcd.com:8009 route=APP05 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP06.abcd.com:8009 route=APP06 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP07.abcd.com:8009 route=APP07 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP08.abcd.com:8009 route=APP08 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP09.abcd.com:8009 route=APP09 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP10.abcd.com:8009 route=APP10 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP11.abcd.com:8009 route=APP11 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP12.abcd.com:8009 route=APP12 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP13.abcd.com:8009 route=APP13 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP14.abcd.com:8009 route=APP14 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP15.abcd.com:8009 route=APP15 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP16.abcd.com:8009 route=APP16 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP21.abcd.com:8009 route=APP21 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP22.abcd.com:8009 route=APP22 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP23.abcd.com:8009 route=APP23 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP24.abcd.com:8009 route=APP24 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP25.abcd.com:8009 route=APP25 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP26.abcd.com:8009 route=APP26 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP27.abcd.com:8009 route=APP27 timeout=120 ttl=600 keepalive=On
>     BalancerMember ajp://APP28.abcd.com:8009 route=APP28 timeout=120 ttl=600 keepalive=On
>     ProxySet stickysession=JSESSIONID
> </Proxy>
> 
> I am seeing uneven load distribution among the application servers
> when I check the Apache balancer-manager. In fact, the first 13 app
> servers (app01 to app13, call them batch1) get almost equal load among
> themselves, and the remaining app servers (app14 to app16 and app21 to
> app28, batch2) also get equal load among themselves, but the batch1
> servers carry almost 3 times the load of the batch2 servers.
> 
> I also tried to rule out a network problem. Traceroute shows almost
> identical paths of about 30 hops to the servers in both batches
> (batch1 as well as batch2).
> 
> I am unable to figure out what the issue is. Can anyone please help
> me out? Any help is really appreciated.

Remember that users can keep their session identifiers longer than their
sessions actually last. The load balancer doesn't keep a mapping of
session ids to app servers; instead, it relies on a node hint embedded
in the session id, like 8927418392BACBD3298.workerName, and routes on
that suffix.
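To make that concrete, here is a minimal Java sketch of the rule
mod_proxy applies (mod_proxy itself is C, and the session id value here
is hypothetical): everything after the last '.' in the cookie value is
treated as the route and matched against the BalancerMember route=
parameters.

    // Minimal sketch of the sticky-routing rule; this is NOT
    // mod_proxy's actual code, and the session ids are made up.
    public class RouteHint {
        static String routeOf(String sessionId) {
            // The node hint is whatever follows the last dot, if any
            int dot = sessionId.lastIndexOf('.');
            return dot >= 0 ? sessionId.substring(dot + 1) : null;
        }

        public static void main(String[] args) {
            System.out.println(routeOf("8927418392BACBD3298.APP05")); // APP05
            System.out.println(routeOf("8927418392BACBD3298"));       // null: falls back to the lb algorithm
        }
    }

A request whose cookie carries a valid route suffix is pinned to that
member and bypasses the balancing algorithm entirely, which is why stale
cookies can skew the distribution.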

If a user logs in, is assigned to a node, and then keeps using their
browser for a month without restarting it, they will always go to the
same server unless their session cookie is destroyed *and* they are then
redirected so they can be re-balanced before another session is created.
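If you want that behavior, a logout handler has to both invalidate the
session and expire the cookie before redirecting. A rough sketch,
assuming Tomcat 7's javax.servlet API and the default JSESSIONID cookie
name (the class name, cookie path, and redirect target are made up for
illustration):

    import java.io.IOException;
    import javax.servlet.http.Cookie;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.HttpSession;

    public class LogoutServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            HttpSession session = req.getSession(false);
            if (session != null) {
                session.invalidate();   // drop the server-side session
            }
            // Expire the cookie so the stale ".APPnn" route hint is gone
            Cookie cookie = new Cookie("JSESSIONID", "");
            cookie.setPath("/");
            cookie.setMaxAge(0);
            resp.addCookie(cookie);
            // The next request carries no route hint, so the balancer's
            // lbmethod picks a member afresh before a new session exists
            resp.sendRedirect("/");
        }
    }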

This might just be a natural "clumping" of sessions onto various
servers. If you watch it, does the distribution settle down over time,
or do the nodes appear to become more and more segregated?

-chris
