Another STRANGE "feature/bug" of libss7 !

marcelo@xxxxxxxxxx (Marcelo Pacheco) · Mon, 03 Dec 2012 23:57:11 -0200

Just finished a DAHDI patch for 2.6.1 that keeps the performance 
advantages of DAHDI MTP2, while providing the foundation of a 
solid/robust/reliable MTP2 implementation in userland:

1 - DAHDI repeats a transmitted message at most X times; once the message 
has been sent X times, the link starts transmitting flags. With X=250 a 
FISU is repeated for about 200ms before flags take over, forcing the 
userland to prove it's alive
2 - Repeated duplicate messages received are suppressed only up to Y 
times. For instance, Y=120 is approx. 100ms worth of FISUs, so a FISU 
would be delivered every 100ms, which keeps the userland from thinking 
the link is alive when it could be dead
3 - When a DAHDI MTP2 link is configured or closed, or asterisk dies, it 
automatically transmits LSSU SIOS (Out of Service), and the LSSU SIOS is 
repeated with no limit, overriding item 1. Today, if asterisk crashes, 
the link keeps repeating the last message, most likely a FISU, which 
fools the other side into thinking the link is still alive until the 
other side gets wiser !
4 - Instead of repeating an MSU (MTP3 or L4 message), DAHDI automatically 
converts the MSU about to be repeated into a FISU, so there's no need 
for userland to send a FISU to update the FSN/FIB
5 - Allows DAHDI MTP2 on dynamic spans. DAHDI MTP2 actually has the same 
requirements as dchan mode, so with a tiny change it could be used with 
any driver that has dchan mode. This allows testing MTP2 implementations 
in a virtual machine or laptop without an E1/T1 DAHDI card.
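
As a rough illustration of items 1 and 2, the sketch below models the two 
counters in Python rather than kernel C. It assumes a 64 kbit/s timeslot 
and a 6-octet FISU on the wire (3 header octets, 2 CRC octets, one shared 
flag, no bit stuffing), which is roughly where the 200ms and 100ms figures 
come from. The names TxRepeater and RxDedup are hypothetical and are not 
part of DAHDI.

```python
OCTETS_PER_SECOND = 64_000 // 8   # assumption: one 64 kbit/s E0/DS0 timeslot
FISU_OCTETS = 6                   # assumption: BSN/BIB + FSN/FIB + LI + 2 CRC + 1 shared flag


def repeat_window_ms(repeats, unit_octets=FISU_OCTETS):
    """Time spent on the wire by `repeats` back-to-back copies of one signal unit."""
    return repeats * unit_octets * 1000 / OCTETS_PER_SECOND


class TxRepeater:
    """Item 1: repeat the last unit at most `limit` times, then fall back to flags."""

    def __init__(self, limit=250):
        self.limit = limit
        self.unit = None
        self.sent = 0

    def load(self, unit):
        # Userland wrote a new signal unit: restart the repeat budget.
        self.unit = unit
        self.sent = 0

    def next_unit(self):
        # Called each time the transmitter needs something to send.
        if self.unit is not None and self.sent < self.limit:
            self.sent += 1
            return self.unit
        return "FLAGS"  # userland went quiet: stop pretending the link is alive


class RxDedup:
    """Item 2: suppress at most `limit` consecutive duplicates of a received unit."""

    def __init__(self, limit=120):
        self.limit = limit
        self.last = None
        self.suppressed = 0

    def deliver(self, unit):
        # Returns True when the unit should be passed up to userland.
        if unit == self.last and self.suppressed < self.limit:
            self.suppressed += 1
            return False
        self.last = unit
        self.suppressed = 0
        return True
```

Under these assumptions repeat_window_ms(250) gives 187.5 and 
repeat_window_ms(120) gives 90.0, in line with the "about 200ms" and 
"aprox. 100ms" figures above.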

This is incompatible with the current libss7 implementation on the 
transmit side (it never sends repeated FISU/LSSU), but that's a pretty 
simple patch (part of my STP patch already).

However, it's a must for building a robust SS7 stack that would pass a 
serious, thorough certification test.

Without item 2, OCTET counting mode of MTP2 is impossible to implement. 
With it, OCTET counting mode can be implemented with very little overhead.

Again, this isn't free, but I could be bribed to provide this + the 
chan_dahdi/libss7 patch that allows this to function.
But libss7 needs a lot of extra features to be proper, beginning with 
retransmissions.
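
To make "retransmissions" concrete, here is a minimal sketch of the timer 
behavior asked for in the quoted message below: wait X milliseconds for an 
acknowledgement, retransmit if none arrives, and fault the link after Y 
failed retransmissions. It is written in Python for brevity, ignores the 
7-bit FSN/BSN wraparound and the BIB/FIB negative acknowledgement, and 
every name in it (RetransmitQueue, on_fisu, tick, LinkFaulted) is 
hypothetical. None of this is libss7 code.

```python
class LinkFaulted(Exception):
    """Raised when the peer never acknowledges after the allowed retransmissions."""


class RetransmitQueue:
    def __init__(self, ack_timeout_ms, max_retransmits):
        self.ack_timeout_ms = ack_timeout_ms    # "X" in the quoted message
        self.max_retransmits = max_retransmits  # "Y" in the quoted message
        self.unacked = []       # (fsn, payload) pairs sent but not yet acknowledged
        self.deadline = None    # time at which the outstanding MSUs become overdue
        self.retransmits = 0

    def send(self, fsn, payload, now_ms):
        self.unacked.append((fsn, payload))
        if self.deadline is None:
            self.deadline = now_ms + self.ack_timeout_ms
        return [(fsn, payload)]  # units to put on the wire

    def on_fisu(self, bsn, now_ms):
        # Every received unit (FISU included) carries a BSN; drop what it acknowledges.
        before = len(self.unacked)
        self.unacked = [(f, p) for (f, p) in self.unacked if f > bsn]
        if len(self.unacked) < before:
            self.retransmits = 0
            self.deadline = now_ms + self.ack_timeout_ms if self.unacked else None

    def tick(self, now_ms):
        # Called periodically, independent of any new MSU being sent or received.
        if self.deadline is None or now_ms < self.deadline:
            return []
        if self.retransmits >= self.max_retransmits:
            raise LinkFaulted("no acknowledgement after %d retransmissions" % self.retransmits)
        self.retransmits += 1
        self.deadline = now_ms + self.ack_timeout_ms
        return list(self.unacked)  # resend everything still outstanding
```

The point of tick() is that it runs off a clock, so a FISU with BSN < FSN 
or plain silence eventually leads to a retransmission or a faulted link, 
even when no new ISUP or MTP3 traffic is generated.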

Marcelo Pacheco

On 12/02/12 23:03, Marcelo Pacheco wrote:
> Since I developed the basic transport of MTP2 over UDP, I'm now able 
> to use iptables to cause MTP2 messages to be dropped and watch the 
> result, when the MTP2 is running over UDP.
>
> The MTP2 over UDP uses almost exactly the same functionality of DAHDI 
> MTP2 channels, almost exact same behavior, except for a 200ms FISU / 
> LSSU heartbeat (instead of continuous transmissions in dchan mode or 
> no repetition at all between Asterisk and DAHDI in MTP2 mode).
>
> This led to a very disturbing discovery !
>
> If I fully block MTP2 messages between two Asterisk / libss7 
> endpoints, and cause MSUs to be sent by libss7 (but dropped by iptables):
> 1 - libss7 never times out and performs no periodic retransmissions. 
> When you send an MSU you must wait X milliseconds; if an ACK isn't 
> received, retransmission must take place automatically. X should be 
> configurable (depending on link round trip time), and after Y failed 
> retransmissions the link must be faulted !
> 2 - since it never retransmits, it never faults the MTP2 link
> 3 - retransmission only happens when a new MSU gets sent/received
> 4 - Received FISUs showing missed MSUs are ignored !!!!!
>
> I got an IAM sent followed by a REL (try to place the call, hangup, 
> without the other side ever receiving it, so no ACM/ANM/RLC backwards 
> at this time)
> I left the iptables DROP rules in place for 10 minutes: no retransmissions.
> Then I removed the iptables DROP rules.
> The MTP2 over UDP implementation sends FISUs every 200ms, so proper 
> FISUs are being sent.
> The side with MSUs in the transmit queue is now receiving FISUs with 
> received BSN < send FSN (the MSUs were not received by the other 
> side), again, no retransmits performed
> Wait another 10 minutes, just in case
> Still no retransmissions, because no MSUs were sent (no ISUP or MTP3 
> traffic so no new MSUs either way)
> Add another MSU (initiate another call, another IAM), now 
> retransmissions happen.
>
> You might ask, isn't this the function of E1/T1/V.35 ALARMS ? Not 
> quite. 99% of the time you talk to an ITU STP belonging to another 
> carrier, and the E1 circuit ends in a TDM switch belonging to that 
> carrier, which cross-connects (DACS) it to the STP. So if there's an 
> alarm between the other TDM switch and the STP, it doesn't affect the 
> E1 between the TDM switch and yourself:
>
> Physical circuit topology:
> STP <---> Other TDM switch <----> Asterisk (typically TS 1-15, 17-31 
> voice to the TDM switch, TS 16 DACS to the STP)
> The MTP2 link is between Asterisk and the STP, just one E0/DS0 is 
> switched across, so alarms don't carry through.
>
> While this might work if everything is OK, this is completely outside 
> proper MTP2 functionality. Add to this the FACT that DAHDI MTP2 
> doesn't distinguish between "I'm receiving FISUs all the time" and 
> "the link is DEAD", and you have a recipe for headaches if the SS7 link 
> goes silent without an alarm (for instance, the E1 is connected to 
> another switch that cross-connects (DACS) to an STP, and the E1 between 
> that switch and the STP fails). Even if you have two links on completely 
> separate spans all the way, libss7 will queue MSUs on the failed link 
> instead of rerouting them to the link that's left alive !
>
> Not good.
>
> The only reason this doesn't bite you in the ass HARD is that when 
> you have libss7 on one side and a TDM switch / STP with a proper MTP2 
> implementation on the other, the other side will fault the link. Still, 
> this is a very sloppy implementation. It would never pass a complete 
> certification test.
>

-- 
Atenciosamente,

Marcelo Pacheco
M2J Comunicações e Informática
Fixo: (27)2222-8118 / (27)2233-2296
Vivo: (27)9964-5440
Claro: (62)9161-9047
MSN: marcelo at macp.eti.br
E-mail: marcelo at m2j.com.br