Re: pj::Call::onCallState() and similar callbacks should be documented as holding the PJSUA_LOCK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


PS here is the backtraces of our deadlock, from pjsip 2.7.1 on Linux/armhf:

Thread 18 "pjsip"
#0  0x76acadf4 in __lll_lock_wait (futex=futex@entry=0x6263c77c, private=0) at lowlevellock.c:46     # trying to acquire our application mutex
#1  0x76ac3c64 in __GI___pthread_mutex_lock    # trying to acquire our application mutex
#2  0x00228320 in ?? () # our application code callback
#3  0x000ffd7c in ?? () # our application code callback; PJSUA_LOCK is apparently held
#4  0x76fd0d34 in pj::Call::processStateChange (this=this@entry=0x6263c6c0, prm=...) at ../src/pjsua2/call.cpp:723
#5  0x76face6c in pj::Endpoint::on_call_state (call_id=<optimized out>, e=0x711fe50c) at ../src/pjsua2/endpoint.cpp:1005
#6  0x76eff200 in pjsua_call_on_state_changed (inv=0x722a612c, e=0x711fe50c) at ../src/pjsua-lib/pjsua_call.c:3812
#7  0x76ec192c in inv_set_state (inv=0x722a612c, state=<optimized out>, e=<optimized out>) at ../src/pjsip-ua/sip_inv.c:317
#8  0x76ebd67c in mod_inv_on_tsx_state (tsx=0x626fcb54, e=0x711fe50c) at ../src/pjsip-ua/sip_inv.c:717
#9  0x76e73aa0 in pjsip_dlg_on_tsx_state (dlg=0x6240bd9c, tsx=0x626fcb54, e=0x711fe50c) at ../src/pjsip/sip_dialog.c:2066
#10 0x76e6bce0 in tsx_set_state (tsx=tsx@entry=0x626fcb54, state=state@entry=PJSIP_TSX_STATE_PROCEEDING,
    event_src_type=event_src_type@entry=PJSIP_EVENT_TX_MSG, event_src=event_src@entry=0x624095f4, flag=flag@entry=0)
    at ../src/pjsip/sip_transaction.c:1267
#11 0x76e6e228 in tsx_on_state_proceeding_uas (tsx=0x626fcb54, event=<optimized out>) at ../src/pjsip/sip_transaction.c:2710
#12 0x76e70188 in pjsip_tsx_send_msg (tsx=tsx@entry=0x626fcb54, tdata=tdata@entry=0x624095f4) at ../src/pjsip/sip_transaction.c:1789
#13 0x76e732dc in pjsip_dlg_send_response (dlg=0x6240bd9c, tsx=0x626fcb54, tdata=tdata@entry=0x624095f4) at ../src/pjsip/sip_dialog.c:1531
#14 0x76ec16f8 in pjsip_inv_send_msg (inv=0x722a612c, tdata=0x624095f4) at ../src/pjsip-ua/sip_inv.c:3231
#15 0x76efb7f8 in pjsua_call_answer2 (call_id=2, opt=0x0, code=180, reason=0x0, msg_data=0x0) at ../src/pjsua-lib/pjsua_call.c:2317
#16 0x76efb94c in process_pending_call_answer (call=call@entry=0x76f3d3b8 <pjsua_var+30440>) at ../src/pjsua-lib/pjsua_call.c:978
#17 0x76efd280 in pjsua_call_on_incoming (rdata=rdata@entry=0x626efd4c) at ../src/pjsua-lib/pjsua_call.c:1666
#18 0x76f00698 in mod_pjsua_on_rx_request (rdata=<optimized out>) at ../src/pjsua-lib/pjsua_core.c:582
#19 0x76e569fc in pjsip_endpt_process_rx_data (endpt=0x983094, rdata=0x626efd4c, p=<optimized out>, p_handled=0x711fe8f4)
    at ../src/pjsip/sip_endpoint.c:893
#20 0x76e56bd4 in endpt_on_rx_msg (endpt=0x983094, status=<optimized out>, rdata=0x626efd4c) at ../src/pjsip/sip_endpoint.c:1043
#21 0x76e5dfa8 in pjsip_tpmgr_receive_packet (mgr=0x98e310, rdata=rdata@entry=0x626efd4c) at ../src/pjsip/sip_transport.c:1965
#22 0x76e60d68 in udp_on_read_complete (key=0x98b1f0, op_key=<optimized out>, bytes_read=1058) at ../src/pjsip/sip_transport_udp.c:170
#23 0x76cba7b4 in ioqueue_dispatch_read_event (ioqueue=<optimized out>, h=0x98b1f0) at ../src/pj/ioqueue_common_abs.c:605
#24 0x76cbc1c8 in pj_ioqueue_poll (ioqueue=0x98af40, timeout=timeout@entry=0x711fed88) at ../src/pj/ioqueue_epoll.c:812
#25 0x76e566fc in pjsip_endpt_handle_events2 (endpt=0x983094, max_timeout=max_timeout@entry=0x711fedc0, p_count=p_count@entry=0x711fedbc)
    at ../src/pjsip/sip_endpoint.c:744
#26 0x76f01704 in pjsua_handle_events (msec_timeout=<optimized out>) at ../src/pjsua-lib/pjsua_core.c:1988
#27 0x76f01744 in worker_thread (arg=<optimized out>) at ../src/pjsua-lib/pjsua_core.c:704
#28 0x76cbdbec in thread_main (param=0x9a93f4) at ../src/pj/os_core_unix.c:541
#29 0x76ac107c in start_thread (arg=0x711ff410) at pthread_create.c:339

Thread 22 "application"
#0  0x76acadf4 in __lll_lock_wait (futex=0x982c8c, private=0) at lowlevellock.c:46
#1  0x76ac3cc8 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x982c8c) at pthread_mutex_lock.c:116
#2  0x76cbd740 in pj_mutex_lock (mutex=0x982c8c) at ../src/pj/os_core_unix.c:1265
#3  0x76efa7e0 in PJSUA_LOCK () at ../include/pjsua-lib/pjsua_internal.h:580
#4  pjsua_call_get_info (call_id=2, info=info@entry=0x733fd2dc) at ../src/pjsua-lib/pjsua_call.c:1805
#5  0x76fd14ac in pj::Call::getInfo (this=0x6263c6c0) at ../src/pjsua2/call.cpp:408
#6  0x001c5cbc in ?? () # our application calling into PJSIP
#7  0x000f2588 in ?? () # holding our application mutex


On Wed, May 22, 2019 at 10:19 AM Martin Oberhuber <> wrote:
Dear PJSIP List / Maintainers,

As I've just been through debugging a deadlock in our application using Pjsip / pjsua2, I would like to suggest a documentation improvement, that should help other users run into the same trap as we did.

In our case, we've implemented some code in the pj::Call::onCallState() callback, that used a Mutex to modify our application state. Another thread, which took the same application state mutex, called out into PJSIP with the mutex held. It was an innocent call::getInfo() with our own Mutex held, but PJSIP deadlocked on this pattern in some, very rare, cases.

To make develoeprs aware, I would suggest improving the pj::Call::onCallState() documentation as follows:

virtual void pj::Call::onCallState(OnCallStateParam & prm)

Notify application when call state has changed. Application may then query the call info to get the detail call states by calling getInfo() function.

Note that the callback is made with the PJSUA_LOCK held. So the code should not take any resources that other Threads hold while calling into PJSIP.

prmCallback parameter.

References PJ_UNUSED_ARG.


What do you think?
Visit our blog:

pjsip mailing list

[Index of Archives]     [Asterisk Users]     [Asterisk App Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [Linux API]
  Powered by Linux