Hi Ming.
Thanks for the quick response. However, your patch doesn't fix the problem; it just moves it, as shown in the Thread 1 backtrace below. Your patch stopped the PJSUA lock from being obtained in pjsua_call_on_state_changed() (frame 7). However, just two frames later (frame 5), my application calls pjsua_call_get_info() from on_call_state() (frame 6), which attempts to take the PJSUA lock and deadlocks.
I didn't include the backtraces of the other threads because they are blocked essentially the same way as before. I admit I don't know this software well, but why was my initial analysis incorrect? Don't the locks always have to be acquired in the same order?
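To make the lock-order question concrete, here is a minimal sketch of the inversion I have in mind. It is not PJSIP code: tsx_lock and pjsua_lock are plain pthread mutexes that I am using as stand-ins for the transaction lock and the PJSUA lock, and every function name in it is hypothetical. It uses pthread_mutex_trylock() instead of a blocking lock so that it reports the conflict rather than hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks in the backtrace.
 * These are ordinary pthread mutexes, not the real PJSIP objects. */
static pthread_mutex_t tsx_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pjsua_lock = PTHREAD_MUTEX_INITIALIZER;

/* Small handshake so the two threads interleave deterministically. */
static pthread_mutex_t step_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  step_cv  = PTHREAD_COND_INITIALIZER;
static int step = 0;

static void set_step(int s)
{
    pthread_mutex_lock(&step_mtx);
    step = s;
    pthread_cond_broadcast(&step_cv);
    pthread_mutex_unlock(&step_mtx);
}

static void wait_step(int s)
{
    pthread_mutex_lock(&step_mtx);
    while (step < s)
        pthread_cond_wait(&step_cv, &step_mtx);
    pthread_mutex_unlock(&step_mtx);
}

/* Helper thread: takes pjsua_lock first, then wants tsx_lock. */
static void *pjsua_then_tsx(void *arg)
{
    int *would_block = arg;
    int rc;

    pthread_mutex_lock(&pjsua_lock);
    rc = pthread_mutex_trylock(&tsx_lock);
    if (rc == 0)
        pthread_mutex_unlock(&tsx_lock);   /* not expected in this demo */
    *would_block = (rc == EBUSY);

    set_step(1);    /* tell the caller we hold pjsua_lock */
    wait_step(2);   /* keep holding it until the caller has probed */
    pthread_mutex_unlock(&pjsua_lock);
    return NULL;
}

/* Calling thread: takes tsx_lock first, then wants pjsua_lock.
 * Returns 1 if each thread would have blocked on the other's lock,
 * 0 if not, and -1 if the helper thread could not be created. */
static int inverted_order_would_deadlock(void)
{
    pthread_t t;
    int helper_blocked = 0, caller_blocked, rc;

    step = 0;
    pthread_mutex_lock(&tsx_lock);
    if (pthread_create(&t, NULL, pjsua_then_tsx, &helper_blocked) != 0) {
        pthread_mutex_unlock(&tsx_lock);
        return -1;
    }

    wait_step(1);
    rc = pthread_mutex_trylock(&pjsua_lock);
    if (rc == 0)
        pthread_mutex_unlock(&pjsua_lock); /* not expected in this demo */
    caller_blocked = (rc == EBUSY);

    set_step(2);
    pthread_join(t, NULL);
    pthread_mutex_unlock(&tsx_lock);
    return helper_blocked && caller_blocked;
}
```

With blocking pthread_mutex_lock() calls in place of the trylock calls, both threads would wait forever, which is how I read the original hang. If both code paths took the two locks in the same order, one thread would simply wait for the other to finish.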
Thanks for your help,
Dave
Thread 1 (Thread 0x7f1a3935d840 (LWP 27501)):
#0 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 in _L_lock_839 () from /lib64/libpthread.so.0
#2 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 in pj_mutex_lock (mutex=0x4594238) at ../src/pj/os_core_unix.c:1265
#4 in PJSUA_LOCK () at ../include/pjsua-lib/pjsua_internal.h:575
#5 in pjsua_call_get_info (call_id=1091, info=0x7fff7ce7c610) at ../src/pjsua-lib/pjsua_call.c:1817
#6 in on_call_state (call_id=1091, e=0x7fff7ce7ceb0) at TERMmain.c:509
#7 in pjsua_call_on_state_changed (inv=0x7f1a05272688, e=0x7fff7ce7ceb0) at ../src/pjsua-lib/pjsua_call.c:3824
#8 in inv_set_state (inv=0x7f1a05272688, state=PJSIP_INV_STATE_DISCONNECTED, e=0x7fff7ce7ceb0) at ../src/pjsip-ua/sip_inv.c:317
#9 in inv_on_state_incoming (inv=0x7f1a05272688, e=0x7fff7ce7ceb0) at ../src/pjsip-ua/sip_inv.c:4310
#10 in mod_inv_on_tsx_state (tsx=0x7f1a05273608, e=0x7fff7ce7ceb0) at ../src/pjsip-ua/sip_inv.c:717
#11 in pjsip_dlg_on_tsx_state (dlg=0x7f1a05c55598, tsx=0x7f1a05273608, e=0x7fff7ce7ceb0) at ../src/pjsip/sip_dialog.c:2064
#12 in mod_ua_on_tsx_state (tsx=0x7f1a05273608, e=0x7fff7ce7ceb0) at ../src/pjsip/sip_ua_layer.c:178
#13 in tsx_set_state (tsx=0x7f1a05273608, state=PJSIP_TSX_STATE_COMPLETED, event_src_type=PJSIP_EVENT_TX_MSG, event_src=0x7f1a043c0c38, flag=0) at ../src/pjsip/sip_transaction.c:1235
#14 in tsx_on_state_proceeding_uas (tsx=0x7f1a05273608, event=0x7fff7ce7cfb0) at ../src/pjsip/sip_transaction.c:2819
#15 in pjsip_tsx_send_msg (tsx=0x7f1a05273608, tdata=0x7f1a043c0c38) at ../src/pjsip/sip_transaction.c:1751
#16 in pjsip_dlg_send_response (dlg=0x7f1a05c55598, tsx=0x7f1a05273608, tdata=0x7f1a043c0c38) at ../src/pjsip/sip_dialog.c:1529
#17 in pjsip_inv_send_msg (inv=0x7f1a05272688, tdata=0x7f1a043c0c38) at ../src/pjsip-ua/sip_inv.c:3227
#18 in pjsua_call_hangup (call_id=1091, code=606, reason=0x0, msg_data=0x0) at ../src/pjsua-lib/pjsua_call.c:2426
#19 in timer_callback (timer_heap=0x7f1a15cc85f8, timer=0xbf6e90 <dialer_info+3972080>) at TERMmain.c:313
#20 in pj_timer_heap_poll (ht=0x7f1a15cc85f8, next_delay=0x0) at ../src/pj/timer.c:643
#21 in TIMERpoll () at TIMERmain.c:72
#22 in main_func (argc=1, argv=0x7fff7ce7d788) at main.c:352
#23 in pj_run_app (main_func=0x40c6a7 <main_func>, argc=1, argv=0x7fff7ce7d788, flags=0) at ../src/pj/os_core_unix.c:1952
#24 in main (argc=1, argv=0x7fff7ce7d788) at main.c:782
On Mon, Mar 20, 2017 at 5:33 AM, Ming <ming@xxxxxxxxx> wrote:
Hi all,

Thanks for the report and sorry for the delay in answering.

For the problem reported by David, we think the problem is actually caused by thread 1 (instead of thread 3), which shouldn't try to obtain PJSUA's lock after holding the transaction lock. So we propose the fix attached. Please revert your temporary fix (where you remove the acquisition of group lock in pjsip_tsx_layer_find_tsx()) and use our patch instead, and share with us whether it rectifies the issue. Thanks.

As for the deadlock issue reported by Alex, despite the similarity, I believe this is an unrelated issue, so I'll reply to it separately in the original thread.

Finally, for Janu, without the stack trace, I can't be certain that your problem is the same as one or both of these issues. So my suggestion is to apply both patches and see if the problem disappears. Also, I would recommend upgrading to version 2.6, if you haven't, since 2.3 is already quite old.

Best regards,
Ming

______________________________

Also seems like the one I reported (it also happens if the application calls pjsua_call_hangup() between receiving a SIP BYE and sending the corresponding OK).
I also got no response to that.

Regards,
Thomas

------ Original message ------
From: Alex Hermann
Date: Fri., 17 March 2017 09:49
Cc:
Subject: Re: Interesting deadlock bug found causing three threads to deadlock on the PJSUA, UA, and transaction locks.

On Thursday 16 March 2017 14:18:25 CET David Richards wrote:
> I found a bug that causes my application (using the PJSUA API) to deadlock

This looks like the same deadlock I reported on 7-3-2017. Unfortunately, I got no response to it.
--
Alex Hermann

_______________________________________________
Visit our blog: http://blog.pjsip.org

pjsip mailing list
pjsip@lists.pjsip.org
http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org