Hi,

On 2013-06-27 07:48, Pavel Troller wrote:
> Hi Kaloyan,
>
> Hi all,
> sorry for joining so late, but I am on holiday (until the end of the
> week) and rarely checking my mailbox. Thanks to bad weather I did that
> today :)
>
> Never mind, I'm happy you're here!
>
> To the OP:
> while reading the first posts I thought it was an old problem with the
> REL/RSC loop (persistent on start with ANSI signaling), which was fixed
> in libss7 instead of sig_ss7, but I am not sure whether it is a similar
> yet different one or the same issue. It really is a (remaining) problem
> if we receive RLC on a previous REL, but after we have sent RSC. I was
> thinking of clearing the old status bits after we receive RLC, but this
> will not fix the double-RLC-received problem, and we can't ignore the
> first one (or just clear the SENT_REL flag), because we may never get a
> second one. So it would probably be better to skip sending the second
> RSC inside isup_handle_unexpected() if the previous one was sent T17
> (timer seconds) ago. Because the timer is stopped on RLC, it should be
> another timer or some flag to ignore its expiration and not reset
> again ... will work on this next week when I am back.
>
> I think it's another problem. Sometimes I also have this kind of loop,
> lasting for hours, until it somewhat settles itself. But the error I've
> reported here is that we clear the old status flags immediately after
> sending our REL, and if an MSU is already coming back (it may be any
> common MSU like ACM, CPG, ANM, SUS, RES, REL..., at least I've
> encountered all these), we don't expect it, we call
> isup_handle_unexpected() and we send RSC, which is absolutely
> superfluous, because there is nothing wrong with the call state. We
> just have to ignore this (and possibly any other) MSU until we get the
> RLC acknowledging our REL.
> My patch does it by checking ISUP_SENT_REL; however, it might be better
> to postpone clearing the got_sent_msg flags from isup_rel() to the
> ISUP_RLC case in isup_receive(). However, I didn't know whether leaving
> these flags set after sending REL would do harm somewhere, so I did it
> as written, and about 300 thousand calls during yesterday didn't
> reveal any problem with the patch.
> So, today I removed the ss7_message() calls from my patch and since
> then, Asterisk is very quiet and seems very happy, and the cooperating
> EWSDs as well :-).
>

I have just uploaded a new version to review 2150, which actually
ignores unexpected messages when we are waiting for RLC and have the
relevant timers (ISUP_T1, T5, T16 and T17) running, as it is a bit risky
to ignore them otherwise - we may never get RLC on our REL, while the
timers will guarantee that we will resend it or send RSC in this case.

> With regards,
> Pavel
>
> The code in my branch is actually Domjan Attila's version (the patches
> attached to the SS7-27 issue) ported to later Asterisk versions with
> very few additions/modifications, so the muffins are for him, while the
> bugs are from me :)
>
> P.S.
> apologies for top posting - the connection is unstable and I had to
> write the post offline and just copy/paste it
>
> On 2013-06-26 06:42, Pavel Troller wrote:
> Hi!
> So, I'm replying to my own original post, to keep the question and a
> possible answer together without any excessive or unrelated
> information.
> I hope I've found the cause of the problem and I hope I solved it. A
> modified libss7 is now online and I'm waiting for busy hours to see
> whether it will help.
> The problem is that in the isup_rel() function, all the important
> got_sent_msg flags are cleared, so the stack "forgets" a preceding call
> state:
> ...
> isup_rel():
>     c->got_sent_msg |= ISUP_SENT_REL;
>     c->got_sent_msg &= ~(ISUP_SENT_IAM | ISUP_PENDING_IAM |
>         ISUP_CALL_CONNECTED | ISUP_GOT_IAM | ISUP_GOT_CCR | ISUP_SENT_INR);
> ...
> So, an incoming MSU, which was perfectly legitimate before sending REL,
> is now handled as unexpected.
> My solution adds the following code to the isup_receive() function for
> every message which can confuse the stack by the discovered cause
> (an example for the ACM message):
>
> case ISUP_ACM:
> +   if (c->got_sent_msg & ISUP_SENT_REL) {
> +       ss7_message(ss7, "Got unexpected ACM after sending REL on CIC %d PC %d, ignoring ", c->cic, opc);
> +       return 0;
> +   }
>
>     if (!(c->got_sent_msg & ISUP_SENT_IAM)) {
>         ss7_message(ss7, "Got ACM but we didn't send IAM on CIC %d PC %d ", c->cic, opc);
>         return isup_handle_unexpected(ss7, c, opc);
>     }
>
> If my change proves good, I'm planning to remove the ss7_message()
> calls to limit the stack verbosity, as these situations are relatively
> frequent under heavy load and I think they are more or less logical and
> normal.
>
> I would be glad for some words from the KNK branch maintainer(s) on
> whether to create a JIRA issue and put my patch there, or how to
> proceed now in general.
>
> With regards,
> Pavel
>
> Hi!
> I would like to share my experience with the deployment of this
> experimental SS7 branch.
> The first impressions are good, especially the timers seem to work
> well, saving many calls from being frozen.
> However, there are still some strange things, which I would like to
> discuss here, one by one.
> The first one is that the channel sometimes doesn't recognize a message
> (mostly RLC), even if it comes from an action initiated by the channel
> itself.
> Typically, the following is appearing often:
>
> [Jun 24 13:33:41] ERROR[3975]: chan_dahdi.c:14406 dahdi_ss7_error: [1] ISUP timer t17 expired on CIC 27 DPC 4097
> [1] Got RLC but we didn't send REL/RSC on CIC 27 PC 4097 reseting the cic
>
> As I understand it, there were some timeouts and now the channel tries
> to recover by sending RSC and firing T17. However, it seems that it
> immediately rejects the RLC which comes back as a response to the RSC
> that was just sent upon expiry of T17. And this appears again and again
> in the rhythm of T17, and the channel is not operational.
> ss7 show calls shows the following line for the misbehaving CIC:
> 27 4097 11 IAM IAM
>
> Or, a very similar situation:
> [2] Got SUS but no call on CIC 48 PC 4096 reseting the CIC
> [2] Got RLC but we didn't send REL/RSC on CIC 48 PC 4096 reseting the CIC
>
> The first question is why there was no call while SUS was received. My
> idea is that both parties hung up their phones at the same time and
> that the call was undergoing destruction on the Asterisk side (REL just
> sent or something like this) when SUS arrived. Maybe the call was
> marked as cleared even before RLC came back? OK, I can understand this.
> But if the CIC was reset as the first message says (i.e. RSC was sent),
> why is the RLC coming back not recognized then?
>
> Or, just now the following appeared:
>
> [1] Got ACM but we didn't send IAM on CIC 10 PC 4097 reseting the cic
> [1] Got RLC but we didn't send REL/RSC on CIC 10 PC 4097 reseting the cic
>
> Again, it's questionable why this happened, but the second line seems
> to indicate some brokenness again.
>
> To explain: the channel is operating on a gateway equipped with 16 E1s
> and the current traffic is about 10 CAPS; there are two linksets to two
> cooperating exchanges. They are EWSDs, which have a very mature and
> stable SS7, so I'm almost sure that they are not making signalling
> errors.
>
> With regards,
> Pavel
>
> --
> _____________________________________________________________________
> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>
> asterisk-ss7 mailing list
> To UNSUBSCRIBE or update options visit:
> http://lists.digium.com/mailman/listinfo/asterisk-ss7
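
For readers following the thread, the behaviours discussed above can be compared in a small stand-alone C model. This is only a sketch under stated assumptions, not libss7 code: the flag values, the reduced `struct call`, the `release_timer_armed` field, the `enum action` return type and all function names are hypothetical and chosen for illustration. Only the flag names and the set/clear pattern come from the isup_rel() excerpt quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are placeholders for this sketch; libss7 defines its own. */
enum {
    ISUP_SENT_IAM       = 1 << 0,
    ISUP_GOT_IAM        = 1 << 1,
    ISUP_CALL_CONNECTED = 1 << 2,
    ISUP_SENT_REL       = 1 << 3
};

/* Hypothetical outcome of handling one incoming ACM. */
enum action { ACT_PROCESS, ACT_IGNORE, ACT_SEND_RSC };

/* Reduced stand-in for the call structure (assumption, not the real one). */
struct call {
    unsigned got_sent_msg;   /* call state bits */
    int release_timer_armed; /* nonzero while a release timer is running */
};

/* Mirrors the quoted isup_rel() flag handling: set SENT_REL, clear the rest. */
void sketch_send_rel(struct call *c)
{
    assert(c != NULL);
    c->got_sent_msg |= ISUP_SENT_REL;
    c->got_sent_msg &= ~(ISUP_SENT_IAM | ISUP_GOT_IAM | ISUP_CALL_CONNECTED);
    c->release_timer_armed = 1; /* assumed: sending REL starts a timer */
}

/* Reported behaviour: ACM without SENT_IAM is unexpected, circuit is reset. */
enum action acm_original(const struct call *c)
{
    if (!(c->got_sent_msg & ISUP_SENT_IAM))
        return ACT_SEND_RSC;
    return ACT_PROCESS;
}

/* First patch: while our REL is unacknowledged, drop the message. */
enum action acm_ignore_after_rel(const struct call *c)
{
    if (c->got_sent_msg & ISUP_SENT_REL)
        return ACT_IGNORE;
    return acm_original(c);
}

/* Variant described for review 2150: ignore only while a timer is running. */
enum action acm_ignore_if_timer(const struct call *c)
{
    if ((c->got_sent_msg & ISUP_SENT_REL) && c->release_timer_armed)
        return ACT_IGNORE;
    return acm_original(c);
}
```

In this model, an ACM arriving after `sketch_send_rel()` makes `acm_original()` ask for an RSC, while both patched variants ignore it. The two variants differ only when no timer is running: the timer-guarded one falls back to the reset, so a lost RLC cannot leave the circuit ignoring messages indefinitely.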