Help in understanding cascading problem when using 5 MCUs and GnuGk

pierlu <pierlu@xxxxxxxxx> · Tue, 15 Nov 2011 17:44:45 +0100

Hi everybody.

I have to explain a bit before asking my question. 

This is the situation:
When I run large videoconferences I have to cascade more than one MCU. These are the 5 MCUs I use:

A: Radvision viaIp 400, 20 ports
B: Radvision Scopia, 24 ports
C: Radvision Scopia, 24 ports
D: Radvision Scopia, 12 ports
E: Tandberg, 16 ports

Services are set to 4CIF@386kbps max. MCUs B, C and D also accept CIF@384kbps and 4CIF@256kbps.

Cascading is done this way: D -> B <- A -> C <- E (-> indicates the direction of the call). So A is the master MCU. Rooms are created by dialing in.
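To make the topology concrete, here is a minimal sketch of the port budget it implies. It assumes that each cascade link takes one port on the calling MCU and one on the called MCU, which is an assumption and not something stated above; the names `PORTS`, `LINKS` and `free_ports` are made up for illustration.

```python
# Port counts per MCU, taken from the list above.
PORTS = {"A": 20, "B": 24, "C": 24, "D": 12, "E": 16}

# Cascade links as (caller, callee), following D -> B <- A -> C <- E.
LINKS = [("D", "B"), ("A", "B"), ("A", "C"), ("E", "C")]


def free_ports(ports, links):
    """Return the ports left for clients on each MCU.

    Assumption: every cascade link uses one port on each end.
    """
    free = dict(ports)
    for caller, callee in links:
        free[caller] -= 1
        free[callee] -= 1
    return free


budget = free_ports(PORTS, LINKS)
total_for_clients = sum(budget.values())
```

Under that assumption, B and C each give up two ports to cascade links, one for the call from A and one for the call from D or E.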

This is the problem:
It happens rarely, with no apparent trigger event, that during the conference MCUs B and C fail completely. Clients connected to one of these MCUs experience the following:

1) first the audio stops working, as if the conference were muted: this is the sign for me that the MCU is about to go down.
2) video keeps working nicely until the call drops completely.
3) clients trying to dial into the room again are refused, and if I, as the one managing the MCUs, try to recreate the room, I get the message "Resource Unavailable" from the MCU boards, as if all usable ports were taken. In fact no one is connected, because connections are being refused. (I cannot say whether they are refused by the MCUs or by GnuGk, nor whether this happens on LRQs or ARQs. My fault, but the log is too big to tell; read further.)

This is what I don't understand:
Why do I have to unregister the MCU from GnuGk with the unregisterip command via the telnet interface to get a fully functional MCU back? Only after doing so do the media channels become available again and the MCUs accept incoming calls. Of course, the first few times it happened I hard-reset the MCUs to regain functionality, but in the end I discovered that it is the unregistration from the gatekeeper that solves the problem (by the way, the TTL is set to 60 seconds).
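For reference, the manual workaround described above could be scripted roughly as follows. This is only a sketch: it assumes the status (telnet) interface listens on port 7000, which I believe is the GnuGk default but may differ in your configuration, and that no login prompt is configured. The function names and the `STATUS_PORT` constant are made up for illustration.

```python
import ipaddress
import socket

STATUS_PORT = 7000  # assumption: default GnuGk status port, check your config


def build_unregister_command(mcu_ip: str) -> bytes:
    """Build the status-port line that unregisters the endpoint at mcu_ip."""
    ipaddress.ip_address(mcu_ip)  # raises ValueError on anything but an IP
    return f"unregisterip {mcu_ip}\r\n".encode("ascii")


def unregister_mcu(gk_host: str, mcu_ip: str,
                   port: int = STATUS_PORT, timeout: float = 5.0) -> str:
    """Send unregisterip to the gatekeeper and return whatever it replies."""
    command = build_unregister_command(mcu_ip)
    chunks = []
    with socket.create_connection((gk_host, port), timeout=timeout) as sock:
        sock.sendall(command)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            # The status port stays open; stop after a quiet period.
            pass
    return b"".join(chunks).decode("ascii", errors="replace")
```

Calling `unregister_mcu("gk.example.org", "192.0.2.10")` (both values are placeholders) would send the same command that is typed by hand in the telnet session and return the banner plus any status output received before the timeout.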

This is my question: given the symptoms, do you think the fault is due to the interaction between GnuGk and the two MCUs, or are the MCUs alone to blame?
Consider that those two MCUs do not have any problems when handling conferences on their own (that is, without being cascaded).

I tried to go through the GnuGk log, but I'm using trace level 5, and when cascading the log reports events for 80 clients, so it is pretty much unreadable. Moreover, I have no clue what to look for.
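One way to cut such a log down is to keep only the lines that mention the two failing MCUs. The sketch below does plain substring matching and makes no assumption about the exact GnuGk trace format; since level 5 traces may spread one message over several lines, it can also keep a few lines after each match. The name `filter_trace` and its parameters are made up for illustration.

```python
from typing import Iterable, Iterator, Sequence


def filter_trace(lines: Iterable[str], needles: Sequence[str],
                 context: int = 0) -> Iterator[str]:
    """Yield lines containing any needle, plus `context` lines after each match."""
    remaining = 0
    for line in lines:
        if any(needle in line for needle in needles):
            remaining = context  # restart the context window on every match
            yield line
        elif remaining > 0:
            remaining -= 1
            yield line
```

For example, opening the trace file and passing it with `needles=["192.0.2.11", "192.0.2.12"]` (placeholder addresses standing in for MCUs B and C) and `context=20` would reduce the log to the traffic involving those two MCUs, which should make it easier to see whether the refusals happen on ARQs or LRQs.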

Next Monday I have another of these "big" videoconferences. Do you have any suggestions on what to look for if this happens again? (As I said, this does not happen on a regular basis; it may not happen at all.)

Excuse the long explanation and the lack of a GnuGk configuration to go along with it; I can provide it if you think you have a clue about what may be going on.

Thank you for your attention.

Pietro Luigi Angelucci. 