Yea I think I might have been there done that on this issue. Question: when you do a play/record/conference add/remove do you do it from a pjsip callback function? If so I think that is happening is if the conference bring is processing a frame while the callback happens the conf mutex is locked which causes the pjsua_ function to lock the pjsua mutex and then block on the conf mutex. Solution: in the callback record data, like ANI, DNIS, conf_slot, etc. But do the pjsua_ calls in a seperate thread. David Clark At 10:02 AM 6/14/2009, Nguyen Van Minh Danh wrote: >I am using PJSIP 1.1 to write an IVR application. The application >would dial a list of outgoing number, play a pre-recorded wav file >and ask the user to press a DTMF to confirm. The application works >fine for small number of concurrent calls (<5), but with a high >number of concurrent calls (10 and above), there is a deadlock after >a certain number of calls (the exact number could vary but is around >500 on average). > >In all deadlock situations, the behaviour causing the deadlock is >always the same. Strangely, repeating the behaviour on its own does >not replicate the problem. The steps leading to the deadlock is as follows: > >1. The program dials a number, the phone rings and is picked up by >the remote user. The application's on_call_state fires and the call >is in the 'CONFIRMED' state, pjsua_call_info.state = PJSIP_INV_STATE_CONFIRMED. >2. My program begins to transmit the WAV file to the above call >which was picked up. >3. Shortly after the WAV playback is done. another on_call_state >event takes places. The state of the call is PJSIP_INV_STATE_EARLY >(Session Progress), which indicates that the remote phone is ringing. >4. A deadlock happens at this stage, and shortly afterwards, any >PJSIP command will fail with "Timed-out trying to acquire PJSUA mutex". > >Some analysis using Visual Studio reveals that the deadlock is >caused by the following 3 threads > >*****Thread 1 (conference.c)****** > >PJ_DEF(pj_status_t) pjmedia_conf_remove_port( pjmedia_conf *conf, > unsigned port ) >{ >......... > pj_mutex_lock(conf->mutex); >........ >} > >****Calling stack for Thread 1**** > > pj_mutex_lock(pj_mutex_t * mutex=0x04a692ac) Line 911 + 0xc bytes C > pjmedia_conf_remove_port(pjmedia_conf * conf=0x033229fc, unsigned > int port=3) Line 1069 + 0xc bytes C > pjsua_conf_remove_port(int id=3) Line 1631 + 0x10 bytes C > stop_media_session(int call_id=72) Line 1230 + 0xc bytes C > pjsua_media_channel_deinit(int call_id=72) Line 1271 + 0x9 bytes C > pjsua_call_on_state_changed(pjsip_inv_session * inv=0x05cd3e28, > pjsip_event * e=0x05a1058c) Line 2970 + 0xb bytes C > inv_set_state(pjsip_inv_session * inv=0x05cd3e28, pjsip_inv_state > state=PJSIP_INV_STATE_DISCONNECTED, pjsip_event * > e=0x05a1058c) Line 219 + 0x10 bytes C > handle_uac_call_rejection(pjsip_inv_session * inv=0x05cd3e28, > pjsip_event * e=0x05a1058c) Line 3020 + 0xf bytes C > inv_on_state_early(pjsip_inv_session * inv=0x05cd3e28, > pjsip_event * e=0x05a1058c) Line 3256 + 0xd bytes C > mod_inv_on_tsx_state(pjsip_transaction * tsx=0x05e270cc, > pjsip_event * e=0x05a1058c) Line 535 + 0x19 bytes C > pjsip_dlg_on_tsx_state(pjsip_dialog * dlg=0x05cd377c, > pjsip_transaction * tsx=0x05e270cc, pjsip_event * > e=0x05a1058c) Line 1907 + 0x1c bytes C > mod_ua_on_tsx_state(pjsip_transaction * tsx=0x05e270cc, > pjsip_event * e=0x05a1058c) Line 178 + 0x11 bytes C > tsx_set_state(pjsip_transaction * tsx=0x05e270cc, > pjsip_tsx_state_e state=PJSIP_TSX_STATE_COMPLETED, pjsip_event_id_e > event_src_type=PJSIP_EVENT_RX_MSG, void * > event_src=0x04a6c3ec) Line 1095 + 0x15 bytes C > tsx_on_state_proceeding_uac(pjsip_transaction * tsx=0x05e270cc, > pjsip_event * event=0x05a107f8) Line 2642 + 0x14 bytes C > pjsip_tsx_recv_msg(pjsip_transaction * tsx=0x05e270cc, > pjsip_rx_data * rdata=0x04a6c3ec) Line 1570 + 0x12 bytes C > mod_tsx_layer_on_rx_response(pjsip_rx_data * > rdata=0x04a6c3ec) Line 817 + 0xd bytes C > endpt_on_rx_msg(pjsip_endpoint * endpt=0x03327f1c, int status=0, > pjsip_rx_data * rdata=0x04a6c3ec) Line 927 + 0x11 bytes C > pjsip_tpmgr_receive_packet(pjsip_tpmgr * mgr=0x048dc2a8, > pjsip_rx_data * rdata=0x04a6c3ec) Line 1457 + 0x17 bytes C > udp_on_read_complete(pj_ioqueue_key_t * key=0x047aed88, > pj_ioqueue_op_key_t * op_key=0x04a6c3f8, long bytes_read=482) Line > 166 + 0x16 bytes C > ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x047a007c, > pj_ioqueue_key_t * h=0x047aed88) Line 552 + 0x16 bytes C > pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x047a007c, const > pj_time_val * timeout=0x05a1fccc) Line 764 + 0x17 bytes C > pjsip_endpt_handle_events2(pjsip_endpoint * endpt=0x03327f1c, > const pj_time_val * max_timeout=0x05a1fdc8, unsigned int * > p_count=0x05a1fdd8) Line 719 + 0x10 bytes C > pjsua_handle_events(unsigned int msec_timeout=10) Line 1217 + 0x14 bytes C > worker_thread(void * arg=0x00000000) Line 519 + 0x7 bytes C > thread_main(void * param=0x04a68aa4) Line 414 + 0x11 bytes C > >*****Thread 2 (stream.c)****** >static pj_status_t get_frame( pjmedia_port *port, pjmedia_frame *frame) >{ >....... >/* Lock jitter buffer mutex first */ >pj_mutex_lock( stream->jb_mutex ); >....... >} > >****Calling stack for Thread 2**** > > pj_mutex_lock(pj_mutex_t * mutex=0x05ccf534) Line 911 + 0xc bytes C > get_frame(pjmedia_port * port=0x05ccefec, pjmedia_frame * > frame=0x05b1f96c) Line 222 + 0xf bytes C > pjmedia_port_get_frame(pjmedia_port * port=0x05ccefec, > pjmedia_frame * frame=0x05b1f96c) Line 68 + 0x12 bytes C > read_port(pjmedia_conf * conf=0x033229fc, conf_port * > cport=0x05cc9f80, short * frame=0x04a6acf8, unsigned int count=320, > pjmedia_frame_type * type=0x05b1fb20) Line 1363 + 0x10 bytes C > get_frame(pjmedia_port * this_port=0x04a5becc, pjmedia_frame * > frame=0x05b1fd80) Line 1771 + 0x22 bytes C > pjmedia_port_get_frame(pjmedia_port * port=0x04a5becc, > pjmedia_frame * frame=0x05b1fd80) Line 68 + 0x12 bytes C > clock_callback(const pj_timestamp * ts=0x04a6afdc, void * > user_data=0x04a6acdc) Line 181 + 0x10 bytes C > clock_thread(void * arg=0x04a6afc4) Line 255 + 0x18 bytes C > thread_main(void * param=0x04a6b00c) Line 414 + 0x11 bytes C > >--------- > >It seems that thread 1 is waiting on a lock which is locked by >thread 2 (conf->mutex->OwningThread points to thread 2's ID) and >thread 2 is waiting on a lock which is locked by thread 3 >(stream->jb_mutex->OwningThread points to thread 3's ID). Thread 3 >details is as follows: > >*****Thread 3 (jbuf.c)****** > > jb_framelist_put_at(jb_framelist_t * framelist=0x05ccf574, > unsigned int index=15447332, const void * frame=0x05baccb4, > unsigned int frame_size=80, unsigned int bit_info=3435973836) Line > 241 + 0x1c bytes C > pjmedia_jbuf_put_frame2(pjmedia_jbuf * jb=0x05ccf56c, const void > * frame=0x05baccb4, unsigned int frame_size=80, unsigned int > bit_info=3435973836, int frame_seq=15447332, int * > discarded=0x05910b2c) Line 546 + 0x1c bytes C > on_rx_rtp(void * data=0x05ccefe4, void * pkt=0x05bacca8, long > bytes_read=172) Line 1404 + 0x47 bytes C > srtp_rtp_cb(void * user_data=0x05cd2704, void * pkt=0x05bacca8, > long size=172) Line 809 + 0x1f bytes C > on_rx_rtp(pj_ioqueue_key_t * key=0x048f8674, pj_ioqueue_op_key_t > * op_key=0x05bab268, long bytes_read=172) Line 471 + 0x17 bytes C > ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x048df834, > pj_ioqueue_key_t * h=0x048f8674) Line 552 + 0x16 bytes C > pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x048df834, const > pj_time_val * timeout=0x0591feac) Line 764 + 0x17 bytes C > worker_proc(void * arg=0x048dd91c) Line 269 + 0x13 bytes C > thread_main(void * param=0x04a4d940) Line 414 + 0x11 bytes C >Unlike thread 1 and 2 which are frozen waiting for the locks, thread >3 is still running and keeps on printing the following message: > > 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame > 19:06:42.156 Master/sound 257 samples reduced, buf_cnt=656 > 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame > 19:06:42.156 Master/sound 146 samples reduced, buf_cnt=670 > 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame > 19:06:42.156 Master/sound 185 samples reduced, buf_cnt=645 > 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame > 19:06:42.156 Master/sound 164 samples reduced, buf_cnt=641 > 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame >.......... > >Thread 3 never stops printing the message and never releases the >lock (stream->jb_mutex). Therefore, a deadlock occurs between thread >1, 2 and thread 3. > > From what I understand, thread 1 attempts to add a new conference > port for the call which is in 'EARLY' stage. Thread 2 is > responsible for maintaining the audio stream between the conference > port. Thread 3 is responsible for sending/receiving socket data. I > could not figured out the exact location where threads 3 acquire > the lock, thus causing the deadlock later on. > >Any ideas on how I can fix this problem? Is this a bug with PJSIP, >or has it got anything to do with my code? > >Any advice is appreciated. Thanks in advance. > > >_______________________________________________ >Visit our blog: http://blog.pjsip.org > >pjsip mailing list >pjsip at lists.pjsip.org >http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/attachments/20090615/e8829713/attachment.html>