RE: Non-Responsive

We use OpenNMS for monitoring, but Nagios looks like an interesting product.
The restart is a bit more complex, as the process does not die easily and
has to be killed with "-9".


R. Todd Wallace
CTO
Touchstone Systems, Inc.
http://www.tstoneinc.com
+1 214-764-9301 x101

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx
[mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of
Zygmuntowicz Michal
Sent: Wednesday, February 09, 2005 11:51 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: Re:  Non-Responsive

Maybe there is some tool that can tell which part of the kernel the
gatekeeper is sitting in when it becomes non-responsive.

BTW: You can use Nagios or a similar tool to collect various data from the
system and look for correlations between events. You can even monitor status
port responsiveness and restart the gatekeeper if necessary.
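
A minimal sketch of such a status-port check, in Python. It assumes the
status port accepts plain TCP connections and answers a text command; the
command string, host, port and timeout are placeholders to adjust for the
actual setup, and status_port_responds and force_kill are illustrative
names, not part of GnuGk, Nagios or OpenNMS.

```python
import os
import signal
import socket


def status_port_responds(host, port, timeout=5.0, command=b"version\r\n"):
    """Return True if the status port sends back any data within `timeout`.

    A hung gatekeeper may still accept the TCP connection but never answer,
    so a successful connect alone is not treated as healthy.
    The default command is an assumption; use one your gatekeeper answers.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(command)
            # Any bytes at all count as a sign of life.
            return len(sock.recv(4096)) > 0
    except OSError:
        # Covers refused connections, resets and timeouts.
        return False


def force_kill(pid):
    """Send SIGKILL (kill -9), since a hung process ignores a normal stop."""
    os.kill(pid, signal.SIGKILL)
```

A monitoring plugin or cron job could call
status_port_responds("127.0.0.1", 7000), where 7000 is assumed to be the
configured status port, and after a few consecutive failures call force_kill
with the gatekeeper's PID before starting it again.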

I am working on better resource control, but this will definitely take
some time, as the task is complex.

----- Original Message -----
From: "R. Todd Wallace" <rwallace@xxxxxxxxxxxxx>
Sent: Wednesday, February 09, 2005 6:39 PM


We have done some more research, and it seems that RADIUS is idle when the
GNU becomes non-responsive.  We can issue requests to RADIUS and it responds,
which seems to take RADIUS out of the picture.  What might be a problem is
that the GNU cannot free up a socket for a RADIUS request.  I can't tell if
we are running out of processor cycles, memory or sockets.  We have ulimits
and FD limits set, and everything looks typical of the systems mentioned.  We
have a ton of traffic on these and do hit low-ASR periods on particular routes.
When it becomes non-responsive, I can connect to the status port, but it just
gives a blank screen as if it is waiting on a response.  I enter commands and
they just echo on the screen.  I am not sure if the GNU is aware of its
resources, but it sure would be nice if it would just reject requests until
the resource is freed up.  A patch as you suggested would be great!

R. Todd Wallace
-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx
[mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Freddy
Parra
Sent: Tuesday, February 08, 2005 11:35 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: RE:  Non-Responsive

Try increasing the number of socket descriptors that your current Gnugk can
use. Maybe the number of sockets is the problem as Michal stated, but I
think that if it is then Gnugk should be patched so that it can recover from
the lack of socket resources instead of the entire process just becoming
non-responsive.

Freddy

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx
[mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of R.
Todd Wallace
Sent: Tuesday, February 08, 2005 12:00 PM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: RE:  Non-Responsive


We had gone for a long time without any problems, and then we locked up 3
times last night.  Same customer, same traffic.  The only difference is that
we had more call attempts at a lower ASR.

R. Todd Wallace

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx
[mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Freddy
Parra
Sent: Tuesday, February 08, 2005 10:51 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: RE:  Non-Responsive

Actually, I have run into this problem before on 2.2b5. It hasn't happened to
me yet on 2.2.2. When this happens, nothing is responsive. When you check the
CPU usage for the Gnugk process, it stays at zero, and calls begin to drop
while incoming calls never get connected. Also, this has happened to me with
about 500 calls in H.225 routed mode (with tunneling) only. I usually run 3
to 4 times more calls on Gnugk per server. So I don't think it's a socket
issue or a CPU issue, since CPU usage in routed mode is relatively low: when
I'm running 1500 to 2000 calls on my servers, I'm at about 24 percent.

I believe I posted this same problem a few months back. This issue doesn't
happen very often. I once noticed that it happened when I issued a debug
trace on the status port: the system became non-responsive, with the same
symptoms as just described.

Freddy

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx
[mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of
Zygmuntowicz Michal
Sent: Tuesday, February 08, 2005 11:25 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: Re:  Non-Responsive

Maybe you're running out of sockets or cpu usage is too high.
Check port ranges and current socket usage and per process socket limits
(netstat, ulimit).
If you're using acct/auth, make sure your backend is not a bottleneck.
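
A rough way to compare a process's current descriptor usage against its
limit on Linux is sketched below in Python. It assumes a /proc filesystem
and enough privileges to read the target process's entries; fd_usage is an
illustrative helper, not an existing tool.

```python
import os


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process, read from Linux /proc.

    soft_limit is None when the limit is reported as "unlimited".
    """
    # Each entry in /proc/<pid>/fd is one open descriptor (files and sockets).
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft limit, hard limit, units
                value = line.split()[3]
                soft_limit = None if value == "unlimited" else int(value)
                break
    return open_fds, soft_limit
```

Calling fd_usage with the gatekeeper's PID at regular intervals and logging
the result would show whether usage creeps toward the limit before a
lock-up.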

----- Original Message -----
From: "R. Todd Wallace" <rwallace@xxxxxxxxxxxxx>
Sent: Tuesday, February 08, 2005 5:01 PM


> We are having issues with the GNU becoming non-responsive.  The service is
> still active, but you can't go to the status ports and calls start
> declining.  Nothing is written to the gatkeeper.log file.  If you stop the
> service, it will not die and you have to kill with a "-9".  We have the
> GNU's compiled with debug, but a core is not generated.  We are running the
> latest CVS and doing full media.  This seems to come with a very heavy load
> / call charge rate.  The only thing I see is that it scrolls the following
> message across the screen to a call sticks:
>
> 2005/02/07 23:19:03.299 3       ProxyChannel.cxx(2757)  RTCP
> xxx.xxx.xxx.xxx:xxxx socket has no destination address yet, flush ignored
>
> I think this is normal, but not sure if we are having problems with
> sockets not freeing up or not freeing fast enough.
>
> Any Thoughts??
>
> R. Todd Wallace



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click

_______________________________________________________

List: Openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Archive: http://sourceforge.net/mailarchive/forum.php?forum_id=8549
Homepage: http://www.gnugk.org/




