bug report: pjsip/pjsua deadlock

On 18.02.2010, at 16:38, Benny Prijono <bennylp at teluu.com> wrote:

> On Tue, Feb 16, 2010 at 7:50 AM, Klaus Kuehnhammer <klaus at parq.net> wrote:
>> retry, post never made it to the list
>>
>> From: Klaus Kuehnhammer <klaus@xxxxxxxxxxx>
>> Date: 15. Februar 2010 18:20:48 MEZ
>> To: pjsip list <pjsip at lists.pjsip.org>
>> Cc: Andreas Barta <a.barta at centersystems.com>, Andreas Wehrmann
>> <a.wehrmann at centersystems.com>
>> Subject: bug report: pjsip/pjsua deadlock
>>
>> Hi!
>>
>> We've been observing occasional deadlocks in pjsua. They seem to be
>> triggered when a call is made, and then almost immediately ended at about
>> the same time the 200 reply from the called UA is just being processed.
>>
>
> Thanks for the info. But if the deadlock ended, it's not really a
> deadlock, is it?

Sorry, that was phrased a bit awkwardly. The deadlock does not end -  
the application has to make a call and almost immediately end that  
call again. If the application ends the call at the same time the 200  
reply comes in from the network, a deadlock occurs. The application  
never recovers.

My explanation is that the application thread, through its calls to
pjsua, locks the pjsua mutex and then tries to lock the dialog or
transaction mutex. That mutex is already held by the SIP worker thread
(reacting to the 200 reply), which in turn tries to take the pjsua
mutex (in the state change callback) and can't. Neither thread can
proceed.
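
A minimal sketch of the inversion described above, using plain POSIX
threads rather than the pjlib mutex API. The names pjsua_lock, dlg_lock,
run_actor and simulate_inversion are hypothetical stand-ins, not pjsip
symbols. Each thread takes its first lock, waits until the other thread
holds its own first lock, and then uses pthread_mutex_trylock for the
second lock so that the sketch reports the conflict instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the pjsua mutex and the dialog/transaction mutex. */
static pthread_mutex_t pjsua_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dlg_lock = PTHREAD_MUTEX_INITIALIZER;

/* Tiny rendezvous helper so the interleaving is deterministic. */
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sync_cond = PTHREAD_COND_INITIALIZER;
static int arrived = 0;

static void rendezvous(int target)
{
    pthread_mutex_lock(&sync_lock);
    arrived++;
    pthread_cond_broadcast(&sync_cond);
    while (arrived < target)
        pthread_cond_wait(&sync_cond, &sync_lock);
    pthread_mutex_unlock(&sync_lock);
}

struct actor {
    pthread_mutex_t *first;   /* lock taken unconditionally */
    pthread_mutex_t *second;  /* lock attempted while holding the first */
    int second_rc;            /* result of the trylock on the second lock */
};

static void *run_actor(void *p)
{
    struct actor *a = p;

    pthread_mutex_lock(a->first);
    rendezvous(2);            /* both threads now hold their first lock */
    a->second_rc = pthread_mutex_trylock(a->second);
    rendezvous(4);            /* both threads have attempted the second lock */
    if (a->second_rc == 0)
        pthread_mutex_unlock(a->second);
    pthread_mutex_unlock(a->first);
    return NULL;
}

/* Returns 1 if both threads found their second lock busy, i.e. a blocking
   lock call at that point would have deadlocked. */
int simulate_inversion(void)
{
    struct actor app = { &pjsua_lock, &dlg_lock, 0 };     /* application thread */
    struct actor worker = { &dlg_lock, &pjsua_lock, 0 };  /* SIP worker thread */
    pthread_t t_app, t_worker;
    int rc;

    arrived = 0;
    rc = pthread_create(&t_app, NULL, run_actor, &app);
    assert(rc == 0);
    rc = pthread_create(&t_worker, NULL, run_actor, &worker);
    assert(rc == 0);
    pthread_join(t_app, NULL);
    pthread_join(t_worker, NULL);

    return app.second_rc == EBUSY && worker.second_rc == EBUSY;
}
```

Calling simulate_inversion() from a small main() (built with -pthread)
returns 1: both threads see EBUSY on their second lock. With blocking
pthread_mutex_lock calls in the same places, both would wait forever.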

The exact locations where the locks are taken are in the helgrind log.

>
>> Analysis of pjsua (vanilla PC build, version 1.5.5) with helgrind points to
>> a number of inconsistent lock acquisition orders for the pjsua, dialog and
>> transaction mutexes.
>>
>
> We are aware that *some* mutexes are not acquired in uniform order
> (and dare I say it is "by design"), but we have put deadlock detections
> in a couple of places [1]. If you have more specific info on what
> deadlock situation you're seeing then I could probably help further.

Please see above.
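
For readers unfamiliar with the idea of deadlock detection around
inconsistently ordered locks, one common generic pattern is to take the
first lock normally, only try the second one, and back off completely if
it is busy. The sketch below illustrates that pattern with POSIX threads.
It is an assumption about the general technique, not a copy of what
sip_ua_layer.c or pjsua_call.c actually do; lock_both and unlock_both
are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Try to hold outer and inner together without ever blocking on inner
   while outer is held. Returns 0 with both locks held, EBUSY if inner
   stayed busy for max_attempts tries (no lock is held in that case),
   or another error code from the pthread calls. */
int lock_both(pthread_mutex_t *outer, pthread_mutex_t *inner, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int rc = pthread_mutex_lock(outer);
        if (rc != 0)
            return rc;

        rc = pthread_mutex_trylock(inner);
        if (rc == 0)
            return 0;               /* both locks held */

        /* Back off: release outer so the holder of inner can make progress. */
        pthread_mutex_unlock(outer);
        if (rc != EBUSY)
            return rc;
        sched_yield();
    }
    return EBUSY;                   /* caller decides whether to retry or give up */
}

/* Release in the reverse order of acquisition. */
void unlock_both(pthread_mutex_t *outer, pthread_mutex_t *inner)
{
    pthread_mutex_unlock(inner);
    pthread_mutex_unlock(outer);
}
```

The trade-off is that the caller has to handle the EBUSY case, for
example by retrying later, instead of assuming the locks are always
obtained.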

>
> In the meantime, please have a look at this if you haven't:
> http://trac.pjsip.org/repos/wiki/FAQ#sip-deadlock
>
> Cheers
> Benny

Thanks, klaus
>
> [1] Search for "deadlock" in sip_ua_layer.c and pjsua_call.c
>
>> * call_make_call locks the pjsua mutex first, then the transaction mutex,
>>   and finally the dialog
>> * incoming sip messages lock the transaction first, and then the pjsua mutex
>>   in the pjsua_call_on_tsx_state_changed callback
>>
>> * pjsua_call_hangup locks the dialog, then the transaction
>> * incoming and timer-triggered messages lock the transaction, then the
>>   dialog
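
The four acquisition orders listed above can be checked mechanically by
treating each "A is locked before B" observation as an edge in a graph
and looking for a cycle, which is roughly the kind of check a lock-order
tool performs. This is only an illustrative sketch: the enum values and
function names are hypothetical, and the edges are taken from the list
above, not from the helgrind log itself.

```c
#include <assert.h>
#include <string.h>

enum lock_id { LOCK_PJSUA, LOCK_TSX, LOCK_DLG, LOCK_COUNT };

/* before[a][b] != 0 means some code path locks a and then b. */
static int before[LOCK_COUNT][LOCK_COUNT];

void reset_orders(void)
{
    memset(before, 0, sizeof before);
}

void record_order(enum lock_id first, enum lock_id then)
{
    assert(first != then);  /* a lock is never ordered against itself here */
    before[first][then] = 1;
}

/* Depth-first search; state: 0 = unvisited, 1 = on the current path, 2 = done. */
static int visit(int node, int *state)
{
    state[node] = 1;
    for (int next = 0; next < LOCK_COUNT; next++) {
        if (!before[node][next])
            continue;
        if (state[next] == 1)
            return 1;       /* back edge: the orders form a cycle */
        if (state[next] == 0 && visit(next, state))
            return 1;
    }
    state[node] = 2;
    return 0;
}

/* Returns 1 if the recorded orders contain a cycle (potential deadlock). */
int has_order_cycle(void)
{
    int state[LOCK_COUNT] = { 0 };

    for (int n = 0; n < LOCK_COUNT; n++)
        if (state[n] == 0 && visit(n, state))
            return 1;
    return 0;
}

/* The orders from the report, one call per "X then Y" pair. */
void record_reported_orders(void)
{
    record_order(LOCK_PJSUA, LOCK_TSX);  /* call_make_call */
    record_order(LOCK_TSX, LOCK_DLG);    /* call_make_call, continued */
    record_order(LOCK_TSX, LOCK_PJSUA);  /* incoming message, tsx state callback */
    record_order(LOCK_DLG, LOCK_TSX);    /* pjsua_call_hangup */
    record_order(LOCK_TSX, LOCK_DLG);    /* incoming and timer-triggered messages */
}
```

With only the call_make_call orders recorded, has_order_cycle() returns
0. After record_reported_orders() it returns 1, because pjsua/transaction
and dialog/transaction are each taken in both directions.
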
>>
>> There are more errors logged by helgrind (file is attached), but these seem
>> to me the possible culprits in the deadlock case we've seen.
>>
>> The way mutexes are used in the dialogs and transactions is not immediately
>> intuitive (to me at least), so I can't really offer a quick fix for this.
>>
>> It's definitely a real problem though, so we thought we'd let people know...
>> if anyone has an idea how to best go about resolving this, please let us
>> know!
>>
>> Best regards,
>> Klaus
>>
>>
>>
>> --
>> Klaus Kuehnhammer
>> Bitstem Software
>> Wasnergasse 11/5
>> 1200 Wien, Austria
>> +43 664 2133466
>> klaus at bitstem.com
>>
>>
>>
>>
>> _______________________________________________
>> Visit our blog: http://blog.pjsip.org
>>
>> pjsip mailing list
>> pjsip at lists.pjsip.org
>> http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
>>
>>
>
>
>
> -- 
> Best regards,
>
> Benny
>
