Transaction mutex lock/destory problem BUG

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Yaron Zehavi <yaron.zehavi3 <at> gmail.com> writes:

> 
> 
> Hello,
> ?
> I'm developing a softphone for android on top of PJSUA. 
> Every time the client moves between Wifi/3G and receives new IP address I 
restart PJ. 
> I noticed that sometimes when calling pjsua_destory it freezes on join on 
one of the worker threads (configured to 3).
> I added some logs and it seems that the following happens:
> ?
> 1. At some point during registration, the tcp transport is closed with error.
> 2. Thread?X (see tid 6f2) below receives the callback, schedules a retry 
timer (60 sec ?), and continues below...
> 3. Thread Y (see tid 6f1 in bold) below receives timer callback for the same 
transaction, and tries to lock it but waits since it is already locked.
> 4. Thread X continues, and schedules destroy timer
> 
> 5. Thread X executes destroy timer an destroys the transaction including its 
mutex.
> 5. Thread Y is waiting forever....
> ?
> Later on when trying to join the worker threads, join on thread Y does not 
return.
> ?
> Can it be that one thread destroys the transaction that another thread tries 
to lock on, thus leading to thread freeze?
> ?
> How can it be solved ?
> ?
> Thanks,
> 
> Yaron
> ?
> ?
> Note - the logs include my logs as well, some pj logs are suppressed.
> ?
> 03-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.827?? tcpc0x571584? TCP 
connection closed
> 03-17 02:00:28.825 W/libpjsip(17072):??? 02:00:28.827??? tsx0x56bf0c? 
locking_tsx 199303-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.827??? 
tsx0x56bf0c? trying lock_tsx 47eafbf8 tid 6f203-17 02:00:28.825 D/libpjsip
(17072):??? 02:00:28.827??? tsx0x56bf0c? after lock_tsx 47eafbf8 prev 0 tid 6f2
> 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.827??? tsx0x56bf0c? State 
changed from Calling to Terminated, event=TRANSPORT_ERROR03-17 02:00:28.825 
W/libpjsip(17072):??? 02:00:28.827??? pjsua_acc.c? SIP registration failed, 
status=503 (End of file (PJ_EEOF))
> 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.827 sip_endpoint.c? 
pjsip_endpt_cancel_timer(entry=0x817ca17c)03-17 02:00:28.825 V/libpjsip
(17072):??? 02:00:28.827 sip_endpoint.c? pjsip_endpt_schedule_timer
(entry=0x817ca17c, delay=60.0)
> ?
> 03-17 02:00:28.825 V/libpjsip(17072):??? 02:00:28.828??? tsx0x56bf0c? 
Timeout timer event
> 03-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.828 sip_transactio? 
tsx_timer_callback...before lock. self 4050f0/6f1
> 03-17 02:00:28.825 W/libpjsip(17072):??? 02:00:28.828??? tsx0x56bf0c? 
locking_tsx 112403-17 02:00:28.825 D/libpjsip(17072):??? 02:00:28.828??? 
tsx0x56bf0c? trying lock_tsx 47db5e4c tid 6f1
> ?
> 
> 03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836??? atm0x581a84? Mutex 
destroyed by thread pjsua03-17 02:00:28.830 V/libpjsip(17072):??? 
02:00:28.836?? regc0x5817f8? Mutex destroyed by thread pjsua03-17 02:00:28.830 
V/libpjsip(17072):??? 02:00:28.836 sip_endpoint.c? Releasing pool regc0x5817f8
> 03-17 02:00:28.830 V/libpjsip(17072):??? 02:00:28.836?? regc0x5817f8? destroy
(): cap=3072, used=2284(74%), block0=0x57112c-0x57151803-17 02:00:28.830 
V/libpjsip(17072):??? 02:00:28.836 sip_endpoint.c? pjsip_endpt_schedule_timer
(entry=0x56c01c, delay=0.0)
> 03-17 02:00:28.830 W/libpjsip(17072):??? 02:00:28.836??? tsx0x56bf0c? 
unlocking_tsx 200603-17 02:00:28.830 D/libpjsip(17072):??? 02:00:28.836??? 
tsx0x56bf0c? before unlock_tsx 47eafbf8 prev 0 alive 1 tid 6f203-17 
02:00:28.830 D/libpjsip(17072):??? 02:00:28.836??? tsx0x56bf0c? After 
unlock_tsx 47eafbf8 tid 6f2
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transport_? 
on_data_read...after tcp_init_shutdown...self 3601c0/6f2
> 
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? 
Timeout timer event
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transactio? 
tsx_timer_callback...before lock. self 3601c0/6f2
> 03-17 02:00:28.835 W/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? 
locking_tsx 112403-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? 
tsx0x56bf0c? trying lock_tsx 47eb5e4c tid 6f2
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? after 
lock_tsx 47eb5e4c prev 0 tid 6f203-17 02:00:28.835 D/libpjsip(17072):??? 
02:00:28.837 sip_transactio? tsx_timer_callback...before handler. self 
3601c0/6f2
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? State 
changed from Terminated to Destroyed, event=TIMER03-17 02:00:28.835 V/libpjsip
(17072):??? 02:00:28.837??????? hashtbl? 0x4221e4: p_entry 0x56c160 deleted
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 sip_endpoint.c? 
pjsip_endpt_schedule_timer(entry=0x57162c, delay=0.0)03-17 02:00:28.835 
V/libpjsip(17072):??? 02:00:28.837?? tdta0x4e0078? Destroying txdata Request 
msg REGISTER/cseq=54208 (tdta0x4e0078)
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? atm0x4e042c? Mutex 
destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 
sip_endpoint.c? Releasing pool tdta0x4e007803-17 02:00:28.835 V/libpjsip
(17072):??? 02:00:28.837?? tdta0x4e0078? destroy(): cap=12096, used=6740(55%), 
block0=0x581f54-0x583e80
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Mutex 
destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 
02:00:28.837??? tsx0x56bf0c? Mutex destroyed by thread pjsua03-17 02:00:28.835 
V/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? Transaction destroyed!
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837 sip_endpoint.c? 
Releasing pool tsx0x56bf0c03-17 02:00:28.835 V/libpjsip(17072):??? 
02:00:28.837??? tsx0x56bf0c? destroy(): cap=2048, used=800(39%), 
block0=0x56bf0c-0x56c6a8
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transactio? 
tsx_timer_callback...after handler. self 3601c0/6f203-17 02:00:28.835 
W/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? unlocking_tsx 112803-17 
02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? before 
unlock_tsx 47eb5e4c prev 0 alive 0 tid 6f2
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837??? tsx0x56bf0c? After 
unlock_tsx 47eb5e4c tid 6f203-17 02:00:28.835 V/libpjsip(17072):??? 
02:00:28.837??????? hashtbl? 0x457c14: p_entry 0x572954 deleted03-17 
02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??? rtd0x20a300? destroy(): 
cap=4096, used=100(2%), block0=0x20a364-0x20b300
> 03-17 02:00:28.835 V/libpjsip(17072):??? 02:00:28.837??????????? tcp? Mutex 
destroyed by thread pjsua03-17 02:00:28.835 V/libpjsip(17072):??? 
02:00:28.837??? atm0x5728b8? Mutex destroyed by thread pjsua03-17 02:00:28.835 
D/libpjsip(17072):??? 02:00:28.837?? tcpc0x571584? TCP transport destroyed 
with reason 70016: End of file (PJ_EEOF)
> 03-17 02:00:28.835 D/libpjsip(17072):??? 02:00:28.837 sip_transport_? 
tcp_destroy... self 3601c0/6f203-17 02:00:28.835 V/libpjsip(17072):??? 
02:00:28.837??????????? tcp? destroy(): cap=6144, used=5676(92%), 
block0=0x44c0fc-0x44c2e8
> 
> 


I think a ticket should be opened for this issue.

Thanks,
Yaron




[Index of Archives]     [Asterisk Users]     [Asterisk App Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [Linux API]
  Powered by Linux