RE: Non-Responsive


 



I think increasing your ulimit to something like 32768 will probably prevent it, unless you're processing a very high number of calls. The other possibility is that Gnugk is somehow not releasing unused sockets somewhere along the line, and that is exhausting the socket resources. I don't think that is the case, though, since I've had Gnugk run for months without restarting.
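
As a side note, the limit that ulimit -n sets can also be inspected and raised from inside a process with the POSIX getrlimit()/setrlimit() calls. The sketch below is not Gnugk code; RaiseFdLimit is a made-up helper name, shown only to illustrate how the soft limit on open descriptors (which sockets count against) relates to the hard limit.

```cpp
#include <sys/resource.h>

// Hypothetical helper, not part of Gnugk.
// Raise the soft limit on open file descriptors (sockets included) toward
// `wanted`, capped at the hard limit. Returns the resulting soft limit,
// or 0 if the limit could not be read or changed.
rlim_t RaiseFdLimit(rlim_t wanted)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;

    // An unprivileged process may raise its soft limit only up to the hard limit.
    rlim_t target = wanted;
    if (lim.rlim_max != RLIM_INFINITY && target > lim.rlim_max)
        target = lim.rlim_max;

    // Only touch the limit if it is currently lower than what was asked for.
    if (target > lim.rlim_cur) {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return 0;
    }
    return lim.rlim_cur;
}
```

From the shell, the equivalent is to run ulimit -n 32768 in the session that starts gnugk. If the hard limit is lower than that, the value gets capped in this sketch, and the hard limit itself would have to be raised by an administrator.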

 

Another thing that can be done is to check the current total number of calls on the system when a call comes in and compare it against a limit set in the configuration. If the current call total is higher than the configured value, release the call. This part should not be very hard to implement.

 

For example:

 

In proxychannel.cxx under function:

 

bool CallSignalSocket::OnSetup(Q931 &q931pdu, H225_Setup_UUIE &Setup, PString &in_rewrite_id, PString &out_rewrite_id)
{
   .
   .
   .

   // Original line:
   //    if (!(useParent || RasSrv->AcceptUnregisteredCalls(fromIP)))
   //
   // Change it so the call is also rejected when RasSrv->CheckTotalCurrentCalls()
   // reports that the limit is exceeded. You will have to implement this new
   // function in RasSrv.cxx; it checks the number of current calls and compares
   // it to what you have put in the configuration file.

   // Add new line
   bool currentCallsExceeded = RasSrv->CheckTotalCurrentCalls();

   if (currentCallsExceeded || !(useParent || RasSrv->AcceptUnregisteredCalls(fromIP)))
   {
      // Check currentCallsExceeded to pick the correct release code.
      if (currentCallsExceeded)
      {
         // Return NoRouteToDestination or whichever release code you want.
         PTRACE(3, "Q931\tCall limit exceeded, rejecting call " << callid);
         authData.m_rejectCause = Q931::NoRouteToDestination;
         rejectCall = true;
      }
      else
      {
         PTRACE(3, "Q931\tReject unregistered call " << callid);
         authData.m_rejectCause = Q931::CallRejected;
         rejectCall = true;
      }
   }
   else
   {
      if (Setup.HasOptionalField(H225_Setup_UUIE::e_destCallSignalAddress))
         if (RasSrv->GetCallSignalAddress(fromIP) == Setup.m_destCallSignalAddress)
            Setup.RemoveOptionalField(H225_Setup_UUIE::e_destCallSignalAddress);

      if (H225_TransportAddress *dest = request.Process())
      {
         destFound = true;
         calledAddr = *dest;
         if (!useParent)
            useParent = request.GetFlags() & Routing::SetupRequest::e_toParent;
      }
      else
      {
         PTRACE(3, "Q931\tNo destination for unregistered call " << callid);
         //FP REMOVED BY ME-authData.m_rejectReason = request.GetRejectReason();
         authData.m_rejectCause = Q931::NoRouteToDestination;
         rejectCall = true;
      }
   }
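
The CheckTotalCurrentCalls() function mentioned in the comments does not exist in Gnugk yet, so here is a rough, standalone sketch of what such a check could look like. Everything in it (the CallLimiter class, its method names, and the convention that 0 means "no limit") is an assumption made for illustration; a real version would live in RasSrv and read the limit from the configuration file.

```cpp
#include <atomic>

// Hypothetical stand-in for the call-limit state that RasSrv would hold.
// None of these names exist in Gnugk; they only illustrate the idea.
class CallLimiter {
public:
    // maxCalls == 0 means "no limit configured".
    explicit CallLimiter(unsigned maxCalls)
        : m_maxCalls(maxCalls), m_currentCalls(0) {}

    // Call once when a call is admitted and once when it ends.
    // Every OnCallEnded() must be paired with an earlier OnCallStarted().
    void OnCallStarted() { ++m_currentCalls; }
    void OnCallEnded()   { --m_currentCalls; }

    // Returns true when the number of active calls has reached the
    // configured maximum, i.e. one more call would exceed it.
    bool CheckTotalCurrentCalls() const
    {
        return m_maxCalls != 0 && m_currentCalls.load() >= m_maxCalls;
    }

private:
    const unsigned m_maxCalls;
    std::atomic<unsigned> m_currentCalls;
};
```

This version assumes the incoming call has not been counted yet when the check runs, which is why it compares with >=. If the new call is already included in the total at that point, a plain > comparison would match the "higher than the configured value" rule described earlier.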

 

 

  Gnugk already supports something very similar: by configuring RedirectGK = Calls > x you can redirect calls to another gatekeeper. But this only works when you're using RAS, before the endpoint sends the initial Setup message.

 

  The hack above takes effect when the Setup message reaches the gatekeeper. So it works both for endpoints that use RAS, since they eventually send a Setup message, and for endpoints that send Setup directly. It does not redirect the call to another gatekeeper, though; it only terminates the call with the release code you choose.

 

  Another option is to check the total number of sockets currently in use instead of calling CheckTotalCurrentCalls().

 

  I hope this helps; these are some ideas that came to mind.

 

 

Freddy

 

 

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx [mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of R. Todd Wallace
Sent: Friday, February 25, 2005 11:21 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: RE: Non-Responsive

 

Is there anything we can help with?  This has bitten us hard a few times and
we really need a solution.  We are willing to help out where we can...

 

 

R. Todd Wallace

 

 

-----Original Message-----
From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx [mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of Jan Willamowius
Sent: Thursday, February 24, 2005 3:05 AM
To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: Non-Responsive

 

Freddy, that is very good testing!

 

I'm currently busy with non-GnuGk projects, but it would be great if you
could pinpoint the place in the code where we are still missing a check for
a failed socket open based on your current test setup. We fixed a lot of
those moving from 2.2.0 to 2.2.1 but obviously not all and you should be
able to get a backtrace from your segfault.

 

After that we need a strategy for how best to deal with an out-of-socket
situation. If we have only reached the limit per thread, we might actually be
able to dynamically start another handler thread and be fine. But it could
also be a system limit we are reaching, and then we'd need a strategy to
handle current calls and registrations and avoid new calls. Quite a bit of
work, but doable.
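
To illustrate the kind of check meant here, below is a minimal standalone sketch of a socket open that classifies the failure instead of assuming success. It is not taken from the GnuGk sources; OpenSignalSocket and the OpenResult enum are hypothetical names. On Linux, errno 24 is EMFILE (too many open files in the process), which would fit the "Error=24" in the assertion quoted below, while ENFILE indicates that the system-wide table is full.

```cpp
#include <sys/socket.h>
#include <cerrno>

// Hypothetical result type, used only for this sketch.
enum OpenResult { OpenOk, OutOfDescriptors, OtherError };

// Open a TCP socket and report why it failed rather than assuming success.
// EMFILE: the per-process descriptor limit was reached.
// ENFILE: the system-wide limit on open files was reached.
OpenResult OpenSignalSocket(int &fd)
{
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd >= 0)
        return OpenOk;
    if (errno == EMFILE || errno == ENFILE)
        return OutOfDescriptors;   // caller should refuse the new call, not crash
    return OtherError;
}
```

With a result like OutOfDescriptors the caller can keep existing calls and registrations alive and only reject the new call, which is one possible form of the strategy described above.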

 

So short,

Jan

 

 

Freddy Parra wrote:

> Test on generating calls for Gnugk Version 2.2.2:

>

>

> The following test was performed:

>

>

> 1.)  I used screens to create a separate session on Linux and set

> ulimit to be 65 sockets for gnugk to run with.

>

> 2.)  I had signaling handlers set to 15 and with h225 routed and h245

> being tunneled. No media proxying.

>

> 3.)  Generated 3 sets of 10 calls each to Gnugk.

>

> 4.)  During the second set of initiated calls Gnugk displayed the

> following error:

>

>      

>

> Assertion fail: Operating System error, file tlibthrd.cxx, line 730,

> Error=24

>

>

> <A>bort, <C>ore dump, <I>gnore?

>

>

> 5.)  Once this error appeared one could continue to send calls to the
> system, but no more sockets were being created or released when
> checking the total sockets with lsof -p 29621 | wc -l.

>

>

> 6.)  I repeated this same test a few more times and the same error

> appeared.

>

>

> I then decided to perform the test on an older version, 2.2b5, that I
> had been using for many months before moving to 2.2.2.
>
> The same errors occurred as in 2.2.2. I also did not see any new sockets
> being created or released when issuing lsof -p 29621 | wc -l.

>

> The status port was blank with _ characters appearing.

>

>

> I used the following Call Generator command.

>

>

> callgen323 -m 10 -r 10 --tmaxest 5 --tmincall 3 --tmaxcall 3

> --tminwait 1 --tmaxwait 1 -u gen2 -g xxx.xxx.xxx.xxx CISCO5850E

>

>

>

>

>

> ________________________________

>

> From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx

> [mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of

> Freddy Parra

> Sent: Wednesday, February 23, 2005 11:04 AM

> To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx

> Subject: RE: Non-Responsive

>

>

> Here are some current results:

>

>

> When max sockets are reached the following behavior was noted:

>

>

> For this test I had the following conditions set:

>

>

> 1.) I used screens to create a separate session window on Linux and

> set ulimit to be 60 sockets for gnugk to run with.

>

> 2.) I had signaling handlers set to 15 and with h225 routed and h245

> being tunneled. No media proxying.

>

> 3.) I was able to create 3 telnet instances to the status port and on

> the 4th instance I got a blank telnet response. This meant I ran out

> of sockets

>

>     which is the state I wanted gnugk to be in.

>

> 4.) When I issued a reload on one of the telnet sessions all endpoints

> that were registered suddenly unregistered.

>

> 5.) I retested the same case scenario and this time had a segmentation

> fault (core dump).

>

> 6.) I performed the following test a few times over and the same

> scenarios played out, with endpoints un-registering and segmentation

> fault happening.

>

>

>

> I'm in the process of creating some tests with the call generator to
> see what the behavior is when calls are being sent to the system when
> max sockets are reached.

>

>

> Freddy

>

>

> -----Original Message-----

> From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx

> [mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of

> Freddy Parra

> Sent: Tuesday, February 22, 2005 6:53 PM

> To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx

> Subject: RE: Non-Responsive

>

>

> Hi Todd,

>

>

> That's interesting. I was thinking of maybe running the call generator
> program that comes with openh323 and setting the ulimit for total
> sockets on gnugk to a very small number like 20, then generating 50
> or more calls at the same time using the call generator. Maybe this way
> we can duplicate this behavior and see if this is the actual problem. I
> will try later today and post my findings if I find something.

>

>

> Freddy

>

>

>

>

> -----Original Message-----
> From: openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx [mailto:openh323gk-users-admin@xxxxxxxxxxxxxxxxxxxxx] On Behalf Of R. Todd Wallace
> Sent: Tuesday, February 22, 2005 5:55 PM
> To: openh323gk-users@xxxxxxxxxxxxxxxxxxxxx
> Subject: RE: Non-Responsive

>

>

> Something that was noticed was that the GNU had a ton of sockets in wait
> state.  It looks like there was a blip in IP and these sockets did not get
> released and the GNU had them in wait state.

>

>

>

>

> -------------------------------------------------------

>

> SF email is sponsored by - The IT Product Guide

>

> Read honest & candid reviews on hundreds of IT Products from real

> users.

>

> Discover which products truly live up to the hype. Start reading now.

>

> http://ads.osdn.com/?ad_ide95&alloc_id396&op=ick

>

>

> _______________________________________________________

>

>

> List: Openh323gk-users@xxxxxxxxxxxxxxxxxxxxx

>

> Archive: http://sourceforge.net/mailarchive/forum.php?forum_id...49

>

> Homepage: http://www.gnugk.org/

>

>

 

 

--

Jan Willamowius, jan@xxxxxxxxxxxxxx, http://www.willamowius.de/

 

 


 

 

 

 

 


