bug report: pjsip/pjsua deadlock

On Tue, Feb 16, 2010 at 7:50 AM, Klaus Kuehnhammer <klaus at parq.net> wrote:
> retry, post never made it to the list
>
> From: Klaus Kuehnhammer <klaus@xxxxxxxxxxx>
> Date: 15 February 2010 18:20:48 CET
> To: pjsip list <pjsip at lists.pjsip.org>
> Cc: Andreas Barta <a.barta at centersystems.com>, Andreas Wehrmann
> <a.wehrmann at centersystems.com>
> Subject: bug report: pjsip/pjsua deadlock
>
> Hi!
>
> We've been observing occasional deadlocks in pjsua. They seem to be
> triggered when a call is made, and then almost immediately ended at about
> the same time the 200 reply from the called UA is just being processed.
>

Thanks for the info. But if the deadlock ended, it's not really a
deadlock, is it?

> Analysis of pjsua (vanilla PC build, version 1.5.5) with helgrind points to
> a number of incoherent lock acquisition orders for the pjsua, dialog and
> transaction mutexes.
>

We are aware that *some* mutexes are not acquired in a uniform order
(and dare I say it is "by design"), but we have put deadlock detection
in a couple of places [1]. If you have more specific info on the
deadlock situation you're seeing, then I could probably help further.

In the meantime, please have a look at this if you haven't:
http://trac.pjsip.org/repos/wiki/FAQ#sip-deadlock

Cheers
 Benny

[1] Search for "deadlock" in sip_ua_layer.c and pjsua_call.c
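
One common way to avoid a deadlock when two mutexes are knowingly taken in
different orders is a try-lock with a bounded retry. The sketch below is a
generic POSIX threads illustration of that idea only; it is not the code in
sip_ua_layer.c or pjsua_call.c, and lock_pair_with_backoff and max_retry are
made-up names for this example.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical helper, not part of pjsip.
 * Takes two mutexes that other threads may acquire in the opposite order.
 * The first is locked normally, the second is only tried. If the second
 * is busy, the first is released again so the other thread can make
 * progress, and the whole attempt is repeated up to max_retry times. */
static bool lock_pair_with_backoff(pthread_mutex_t *first,
                                   pthread_mutex_t *second,
                                   int max_retry)
{
    for (int attempt = 0; attempt < max_retry; ++attempt) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) == 0)
            return true;             /* caller now holds both mutexes */
        pthread_mutex_unlock(first); /* back off instead of blocking */
        sched_yield();               /* give the other thread a chance */
    }
    return false;                    /* gave up; caller holds neither */
}
```

A caller that gets false back holds no lock and has to treat the operation
as failed (or retry later), which trades a possible hang for an error path.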

> * call_make_call locks the pjsua mutex first, then the transaction mutex,
> and finally the dialog
> * incoming sip messages lock the transaction first, and then the pjsua mutex
> in the pjsua_call_on_tsx_state_changed callback
>
> * pjsua_call_hangup locks the dialog, then the transaction
> * incoming and timer-triggered messages lock the transaction, then the
> dialog
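
The orderings listed above are the classic lock-order inversion: one path
takes A then B, another takes B then A. As a minimal sketch of the usual
remedy (a single global order), the example below uses two plain pthread
mutexes as stand-ins for the transaction and dialog mutexes and always locks
the lower address first. All names here are hypothetical, and this says
nothing about how such an order could be retrofitted into pjsip, where the
locks are taken in nested calls.

```c
#include <pthread.h>
#include <stdint.h>

/* Stand-ins for the transaction and dialog mutexes; not pjsip objects. */
static pthread_mutex_t tsx_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dlg_mutex = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

/* Lock both mutexes, lower address first, whatever order the caller
 * names them in. Every thread therefore uses the same order. */
static void lock_both_ordered(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* Models the hangup path: names the dialog first, then the transaction. */
static void *hangup_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        lock_both_ordered(&dlg_mutex, &tsx_mutex);
        ++shared_counter;
        unlock_both(&dlg_mutex, &tsx_mutex);
    }
    return NULL;
}

/* Models the incoming-message path: transaction first, then dialog. */
static void *incoming_path(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        lock_both_ordered(&tsx_mutex, &dlg_mutex);
        ++shared_counter;
        unlock_both(&tsx_mutex, &dlg_mutex);
    }
    return NULL;
}

/* Runs both paths concurrently and returns the final counter value. */
static long run_both_paths(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, hangup_path, NULL);
    pthread_create(&t2, NULL, incoming_path, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

If the two paths instead called pthread_mutex_lock directly in the order
they name the mutexes, the same loop could hang with each thread holding
one mutex and waiting for the other.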
>
> There are more errors logged by helgrind (file is attached), but these seem
> to me the possible culprits in the deadlock case we've seen.
>
> The way mutexes are used in the dialogs and transactions is not immediately
> intuitive (to me at least), so I can't really offer a quick fix for this.
>
> It's definitely a real problem though, so we thought we'd let people know...
> if anyone has an idea how to best go about resolving this, please let us
> know!
>
> Best regards,
> Klaus
>
>
>
> --
> Klaus Kuehnhammer
> Bitstem Software
> Wasnergasse 11/5
> 1200 Wien, Austria
> +43 664 2133466
> klaus at bitstem.com
>



-- 
Best regards,

 Benny


