Yaron Zehavi <yaron.zehavi3 <at> gmail.com> writes: > > > Hello, > ? > I'm developing a softphone for android on top of PJSUA. > Every time the client moves between Wifi/3G and receives new IP address I restart PJ. > I noticed that sometimes when calling pjsua_destory it freezes on join on one of the worker threads (configured to 3). > I added some logs and it seems that the following happens: > ? > 1. At some point during registration, the tcp transport is closed with error. > 2. Thread?X (see tid 6f2) below receives the callback, schedules a retry timer (60 sec ?), and continues below... > 3. Thread Y (see tid 6f1 in bold) below receives timer callback for the same transaction, and tries to lock it but waits since it is already locked. > 4. Thread X continues, and schedules destroy timer > > 5. Thread X executes destroy timer an destroys the transaction including its mutex. > 5. Thread Y is waiting forever.... > ? > Later on when trying to join the worker threads, join on thread Y does not return. > ? > Can it be that one thread destroys the transaction that another thread tries to lock on, thus leading to thread freeze? > ? > How can it be solved ? > ? > Thanks, > > Yaron > ? > ? > Note - the logs include my logs as well, some pj logs are suppressed. > ? > 03-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.827?? tcpc0x571584? TCP connection closed > 03-17 02:00:28.825 W/libpjsip(17072):??? 02:00:28.827??? tsx0x56bf0c? locking_tsx 199303-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.827??? tsx0x56bf0c? trying lock_tsx 47eafbf8 tid 6f203-17 02:00:28.825 D/libpjsip (17072):??? 02:00:28.827??? tsx0x56bf0c? after lock_tsx 47eafbf8 prev 0 tid 6f2 > 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.827??? tsx0x56bf0c? State changed from Calling to Terminated, event=TRANSPORT_ERROR03-17 02:00:28.825 W/libpjsip(17072):??? 02:00:28.827??? pjsua_acc.c? SIP registration failed, status=503 (End of file (PJ_EEOF)) > 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.827 sip_endpoint.c? pjsip_endpt_cancel_timer(entry=0x817ca17c)03-17 02:00:28.825 V/libpjsip (17072):??? 02:00:28.827 sip_endpoint.c? pjsip_endpt_schedule_timer (entry=0x817ca17c, delay=60.0) > ? > 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.828??? tsx0x56bf0c? Timeout timer event > 03-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.828 sip_transactio? tsx_timer_callback...before lock. self 4050f0/6f1 > 03-17 02:00:28.825 W/libpjsip(17072):??? 02:00:28.828??? tsx0x56bf0c? locking_tsx 112403-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.828??? tsx0x56bf0c? trying lock_tsx 47db5e4c tid 6f1 > ? > > 03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836??? atm0x581a84? Mutex destroyed by thread pjsua03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836?? regc0x5817f8? Mutex destroyed by thread pjsua03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836 sip_endpoint.c? Releasing pool regc0x5817f8 > 03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836?? regc0x5817f8? destroy (): cap=3072, used=2284(74%), block0=0x57112c-0x57151803-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836 sip_endpoint.c? pjsip_endpt_schedule_timer (entry=0x56c01c, delay=0.0) > 03-17 02:00:28.830 W/libpjsip(17072):??? 02:00:28.836??? tsx0x56bf0c? unlocking_tsx 200603-17 02:00:28.830 D/libpjsip(17072):??? 02:00:28.836??? tsx0x56bf0c? before unlock_tsx 47eafbf8 prev 0 alive 1 tid 6f203-17 02:00:28.830 D/libpjsip(17072):??? 02:00:28.836??? tsx0x56bf0c? After unlock_tsx 47eafbf8 tid 6f2 > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transport_? on_data_read...after tcp_init_shutdown...self 3601c0/6f2 > > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Timeout timer event > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transactio? tsx_timer_callback...before lock. self 3601c0/6f2 > 03-17 02:00:28.835 W/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? locking_tsx 112403-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? trying lock_tsx 47eb5e4c tid 6f2 > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? after lock_tsx 47eb5e4c prev 0 tid 6f203-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transactio? tsx_timer_callback...before handler. self 3601c0/6f2 > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? State changed from Terminated to Destroyed, event=TIMER03-17 02:00:28.835 V/libpjsip (17072):??? 02:00:28.837??????? hashtbl? 0x4221e4: p_entry 0x56c160 deleted > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 sip_endpoint.c? pjsip_endpt_schedule_timer(entry=0x57162c, delay=0.0)03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837?? tdta0x4e0078? Destroying txdata Request msg REGISTER/cseq=54208 (tdta0x4e0078) > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? atm0x4e042c? Mutex destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 sip_endpoint.c? Releasing pool tdta0x4e007803-17 02:00:28.835 V/libpjsip (17072):??? 02:00:28.837?? tdta0x4e0078? destroy(): cap=12096, used=6740(55%), block0=0x581f54-0x583e80 > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Mutex destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Mutex destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Transaction destroyed! > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 sip_endpoint.c? Releasing pool tsx0x56bf0c03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? destroy(): cap=2048, used=800(39%), block0=0x56bf0c-0x56c6a8 > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transactio? tsx_timer_callback...after handler. self 3601c0/6f203-17 02:00:28.835 W/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? unlocking_tsx 112803-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? before unlock_tsx 47eb5e4c prev 0 alive 0 tid 6f2 > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? After unlock_tsx 47eb5e4c tid 6f203-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??????? hashtbl? 0x457c14: p_entry 0x572954 deleted03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? rtd0x20a300? destroy(): cap=4096, used=100(2%), block0=0x20a364-0x20b300 > 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??????????? tcp? Mutex destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? atm0x5728b8? Mutex destroyed by thread pjsua03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837?? tcpc0x571584? TCP transport destroyed with reason 70016: End of file (PJ_EEOF) > 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transport_? tcp_destroy... self 3601c0/6f203-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??????????? tcp? destroy(): cap=6144, used=5676(92%), block0=0x44c0fc-0x44c2e8 > > I think a ticket should be opened for this issue. Thanks, Yaron