Proxy balancing weirdness with bybusyness

Hey all,

We're having a strange problem with our 64-bit Apache 2.2.9 + bybusy patch proxy-balancing to mongrel app servers. What seems to happen is that Apache will indefinitely forget (or ignore) workers that it knows about. You can see this best from the ps output:

deploy   27326 14.3  3.2 271728 130632 ?  Sl  23:23  1:03 mongrel_rails [8000/0/289]: idle
deploy   27329 15.4  3.7 298428 150368 ?  Sl  23:23  1:08 mongrel_rails [8001/0/289]: idle
deploy   27332 16.6  3.8 296292 154976 ?  Sl  23:23  1:13 mongrel_rails [8002/0/288]: idle
deploy   27335 15.5  3.3 279404 136820 ?  Sl  23:23  1:08 mongrel_rails [8003/0/289]: idle
deploy   27338 16.6  3.4 280396 139452 ?  Sl  23:23  1:13 mongrel_rails [8004/0/290]: idle
deploy   27341 13.6  3.3 275600 134724 ?  Sl  23:23  1:00 mongrel_rails [8005/0/288]: idle
deploy   27344  1.1  1.5 155708  62616 ?  Sl  23:23  0:04 mongrel_rails [8006/0/7]: idle
deploy   27347 16.2  3.7 299976 153908 ?  Sl  23:23  1:11 mongrel_rails [8007/0/287]: idle
deploy   27350  1.3  2.5 241708 104364 ?  Sl  23:23  0:05 mongrel_rails [8008/0/5]: idle
deploy   27354  1.4  2.6 246368 109044 ?  Sl  23:23  0:06 mongrel_rails [8009/0/4]: idle
deploy   27359  1.0  1.4 151124  58096 ?  Sl  23:23  0:04 mongrel_rails [8010/0/0]: idle
deploy   27362  0.9  1.4 151140  58112 ?  Sl  23:23  0:04 mongrel_rails [8011/0/0]: idle

The format of the tuple in the mongrel_rails line is [port/pending/handled]: 'pending' is mongrel's internal pending-request count, which should always be 0 or 1, and 'handled' is the number of requests that mongrel has handled up until now.

As you can see from the output, seven of the mongrel processes have served ~290 requests each, while five of them have served <10. This matches up with balancer-manager's status (taken from a few minutes later, so the numbers aren't the same):

Worker URL          Route  RouteRedir  Factor  Set  Status  Elected  To    From
http://cimbar:8000                     1       0    Ok      415      315K  22M
http://cimbar:8001                     1       0    Ok      416      324K  22M
http://cimbar:8002                     1       0    Ok      484      392K  27M
http://cimbar:8003                     1       0    Ok      483      381K  26M
http://cimbar:8004                     1       0    Ok      484      379K  26M
http://cimbar:8005                     1       0    Ok      484      374K  25M
http://cimbar:8006                     1       0    Ok      52       44K   2.6M
http://cimbar:8007                     1       0    Ok      608      474K  34M
http://cimbar:8008                     1       0    Ok      53       41K   2.6M
http://cimbar:8009                     1       0    Ok      53       43K   2.9M
http://cimbar:8010                     1       0    Ok      5        1.1K  6.6K
http://cimbar:8011                     1       0    Ok      7        1.2K  62K

My first guess was that they were being disabled, but balancer-manager says 'Ok'. Next, I looked at the logfiles to see if there was anything amiss, and that's when I found something very odd - lbstatus seems to be skewing itself pretty dramatically, but I can't tell why. For example, here are the last entries for lbstatus for each port:

dan@waterdeep:/var/log/apache2$ for port in {8000..8011}; do fgrep "bybusyness selected worker \"http://cimbar:${port}" /tmp/logfile | tail -n1; done
[Thu Nov 13 23:32:39 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8000" : busy 2 : lbstatus -1922
[Thu Nov 13 23:32:45 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8001" : busy 2 : lbstatus -1910
[Thu Nov 13 23:34:24 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8002" : busy 2 : lbstatus -2233
[Thu Nov 13 23:34:25 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8003" : busy 2 : lbstatus -2236
[Thu Nov 13 23:34:23 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8004" : busy 2 : lbstatus -2234
[Thu Nov 13 23:34:24 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8005" : busy 2 : lbstatus -2236
[Thu Nov 13 23:32:45 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8006" : busy 3 : lbstatus 2468
[Thu Nov 13 23:34:25 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8007" : busy 1 : lbstatus -3444
[Thu Nov 13 23:33:54 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8008" : busy 3 : lbstatus 2724
[Thu Nov 13 23:32:43 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8009" : busy 3 : lbstatus 2459
[Thu Nov 13 23:32:39 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8010" : busy 3 : lbstatus 2987
[Thu Nov 13 23:32:45 2008] [debug] mod_proxy_balancer.c(1173): proxy: bybusyness selected worker "http://cimbar:8011" : busy 3 : lbstatus 2983

I'm wondering if the 'busy 3' notice is the reason why. I see in mod_proxy_balancer.c it is incremented in proxy_balancer_pre_request() and decremented in proxy_balancer_post_request() - is it possible that it's not being decremented properly? The code looks straightforward, but I'll have to have another look at it.

Another thought I had was that maybe Rainer Jung's comment on bug #45501 (the bybusy patch: https://issues.apache.org/bugzilla/show_bug.cgi?id=45501) applied to this situation as well. He refers to counters in mod_jk skewing negative for some reason under load on 64-bit machines.

Our balancer config is below. Everything seems pretty straightforward, so I would imagine it should be working properly. We've also got four mongrels running on the Apache machine itself, both for extra capacity and to handle requests while mongrel is being restarted on cimbar. The lbset configuration works great, but the problem was happening before we added lbset, maxattempts, and timeout to the configuration.

<Proxy balancer://nuperfume lbmethod=bybusyness maxattempts=3 timeout=5>
                BalancerMember http://127.0.0.1:8000 retry=2 lbset=1
                BalancerMember http://127.0.0.1:8001 retry=2 lbset=1
                BalancerMember http://127.0.0.1:8002 retry=2 lbset=1
                BalancerMember http://127.0.0.1:8003 retry=2 lbset=1

                BalancerMember http://cimbar:8000 retry=2 lbset=0
                BalancerMember http://cimbar:8001 retry=2 lbset=0
                BalancerMember http://cimbar:8002 retry=2 lbset=0
                BalancerMember http://cimbar:8003 retry=2 lbset=0
                BalancerMember http://cimbar:8004 retry=2 lbset=0
                BalancerMember http://cimbar:8005 retry=2 lbset=0
                BalancerMember http://cimbar:8006 retry=2 lbset=0
                BalancerMember http://cimbar:8007 retry=2 lbset=0
                BalancerMember http://cimbar:8008 retry=2 lbset=0
                BalancerMember http://cimbar:8009 retry=2 lbset=0
                BalancerMember http://cimbar:8010 retry=2 lbset=0
                BalancerMember http://cimbar:8011 retry=2 lbset=0
        </Proxy>

I've even gone so far as to bring over Apache 2.2.10's mod_proxy code directly and compile that in; it runs fine, but the behaviour doesn't change.

Any ideas?

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
  "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
