Crash with TURN enabled caused by an apparent race condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello

I'm running PJSIP 2.6 on iOS and I've had this crash happening here
and there for almost two years. I recently noticed an uptick of
occurrences so I am trying to figure out what can I do about it.

This crash occurs when receiving a BYE message and the client attempts
to destroy the ICE stream transport. If this takes too long, we end up
with a crash when calling pj_grp_lock_acquire(sess->grp_lock); in
pj_turn_session_on_rx_pkt in turn_session.c:1154, I'm assuming because
sess becomes NULL.

In turn_sock.c:611 in on_data_read there's:

while ((pkt_len=has_packet(turn_sock, data, size)) != 0) {
...
pj_turn_session_on_rx_pkt(turn_sock->sess, data,  size, &parsed_len);
...
}

But I guess turn_sock->sess can become NULL at any point during the iteration.

Is there anything I can do (that won't throw off everything else) to
make sure pj_turn_session_on_rx_pkt isn't called when the sess is
already NULL? I'm thinking of just adding a check around it and keep
the rest of the code as is, but I'm not sure if that won't mess up the
while condition and crash within has_packet or anywhere else. Since
this crash happens only sporadically it will be a while until I get
confirmation that the following is a solution:

if (turn_sock->sess) {
    pj_turn_session_on_rx_pkt(turn_sock->sess, data,  size, &parsed_len);
}

Does anyone have a better suggestion?

I managed to capture the logs of what's happening immediately before
the crash, but due to the asynchronous nature of the logger it might
have missed a couple of statements. Here is the stack trace:

Crashed: Thread
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x0000000000000078

0  Application             0x100a9eaf4 pj_turn_session_on_rx_pkt
(turn_session.c:1154)
1  Application             0x100aa01e8 on_data_read (turn_sock.c:629)
2  Application             0x100aa01e8 on_data_read (turn_sock.c:629)
3  Application             0x100a7f294 ioqueue_on_read_complete
(activesock.c:493)
4  Application             0x100a79bc0 ioqueue_dispatch_read_event
(ioqueue_common_abs.c:605)
5  Application             0x100a7b484 pj_ioqueue_poll (ioqueue_select.c:994)
6  Application             0x100aba9c0 pjsip_endpt_handle_events2
(sip_endpoint.c:742)
7  Application             0x100615c58 worker_thread (pjsua_core.c:1986)
8  Application             0x100a7c5c4 thread_main (os_core_unix.c:541)
9  libsystem_pthread.dylib        0x18a499850 <redacted> + 240
10 libsystem_pthread.dylib        0x18a499760 _pthread_start + 282
11 libsystem_pthread.dylib        0x18a496d94 thread_start + 4


And here is the log:

2017/03/22 11:12:46.715  VERBOSE:  dlg0x1033e18a8      pjsua_0
.Received Request msg BYE/cseq=595 (rdata0x1033c0140)
2017/03/22 11:12:46.715  VERBOSE:  tsx0x10398dea8      pjsua_0
...Transaction created for Request msg BYE/cseq=595 (rdata0x1033c0140)
2017/03/22 11:12:46.715  VERBOSE:  tsx0x10398dea8      pjsua_0
..Incoming Request msg BYE/cseq=595 (rdata0x1033c0140) in state Null
2017/03/22 11:12:46.715  VERBOSE:  tsx0x10398dea8      pjsua_0
...State changed from Null to Trying, event=RX_MSG
2017/03/22 11:12:46.715  VERBOSE:  dlg0x1033e18a8      pjsua_0
....Transaction tsx0x10398dea8 state changed to Trying
2017/03/22 11:12:46.715  VERBOSE:        endpoint      pjsua_0
.....Response msg 200/BYE/cseq=595 (tdta0x10392aa00) created
2017/03/22 11:12:46.715  VERBOSE:  dlg0x1033e18a8      pjsua_0
......Sending Response msg 200/BYE/cseq=595 (tdta0x10392aa00)
2017/03/22 11:12:46.715  VERBOSE:  tsx0x10398dea8      pjsua_0
......Sending Response msg 200/BYE/cseq=595 (tdta0x10392aa00) in state
Trying
2017/03/22 11:12:46.715  VERBOSE:    pjsua_core.c      pjsua_0
.......TX 703 bytes Response msg 200/BYE/cseq=595 (tdta0x10392aa00) to
TLS [SIP server IP]:443:
SIP/2.0 200 OK
Via: SIP/2.0/TLS [SIP server IP]:443;received=[SIP server
IP];branch=z9hG4bK6c77.3f1525aecb045c49ffb1082b201c3c4d.0
Via: SIP/2.0/UDP [another server
IP];branch=z9hG4bK6c77.cf3f5a844f1e397bddfe8a2c10e81989.0
Via: SIP/2.0/UDP [SIP server
IP];branch=z9hG4bK6c77.5f1a75896a852012241c5dd197bcd3c3.0;i=80a
Via: SIP/2.0/TLS [a client ip]:49541;rport=49541;received=[a client
ip];branch=z9hG4bKPjQArtw1u5hMGdzmjSCozUeyFwJMxp2n3A;alias
Call-ID: tofoRAjPDbLHtTZWxgUm8oP5DclyIo9P
From: "alice" <sip:alice@xxxxxxxxxxx>;tag=ElzkySJZCYNml5BIlS5scPBW.SJYA3EU
To: "bob" <sip:bob@xxxxxxxxxxx>;tag=mHfpKjYRccJDJlWvEqJxM0ddEFR-.dJC
CSeq: 595 BYE
Content-Length:  0


--end msg--
2017/03/22 11:12:46.715  VERBOSE:  tsx0x10398dea8      pjsua_0
.......State changed from Trying to Completed, event=TX_MSG
2017/03/22 11:12:46.715  VERBOSE:  dlg0x1033e18a8      pjsua_0
........Transaction tsx0x10398dea8 state changed to Completed
2017/03/22 11:12:46.721  VERBOSE:   pjsua_media.c      pjsua_0
......Call 6: deinitializing media..
2017/03/22 11:12:46.721  VERBOSE:  strm0x1033e442      pjsua_0
........JB summary:
  size=3/eff=3 prefetch=0 level=11
  delay (min/max/avg/dev)=20/220/106/33 ms
  burst (min/max/avg/dev)=1/11/2/0 frames
  lost=8 discard=0 empty=50
2017/03/22 11:12:46.722  VERBOSE:   pjsua_media.c      pjsua_0
........Media stream call06:0 is destroyed
2017/03/22 11:12:46.722  VERBOSE:  transport_zrtp      pjsua_0
.......Media stop - encrypted packets: 4933, decrypted packets: 4920
2017/03/22 11:12:46.722  VERBOSE:         icetp00      pjsua_0
.......Stopping ICE, reason=media stop requested
2017/03/22 11:12:46.722  VERBOSE:         icetp00      pjsua_0
.......Destroying ICE session 0x1039f0e28
2017/03/22 11:12:46.722  VERBOSE:  stuse0x1039478      pjsua_0
.......STUN session 0x1039cac28 destroy request, ref_cnt=23
2017/03/22 11:12:46.722  VERBOSE:  stuse0x10392cc      pjsua_0
.......STUN session 0x10385bc28 destroy request, ref_cnt=22
2017/03/22 11:12:46.722  VERBOSE:  transport_zrtp      pjsua_0
.......transport_destroy()
2017/03/22 11:12:46.722  VERBOSE:  transport_zrtp      pjsua_0
.......Destroy - encrypted packets: 4933, decrypted packets: 4920
2017/03/22 11:12:46.722  VERBOSE:         icetp00      pjsua_0
.......ICE stream transport 0x1031c5028 destroy request..
2017/03/22 11:12:46.722  VERBOSE:  stuntp0x1703ee      pjsua_0
........STUN sock 0x1031c5828 request, ref_cnt=19
2017/03/22 11:12:46.722  VERBOSE:  stuntp0x1703ee      pjsua_0
........STUN session 0x103339428 destroy request, ref_cnt=19
2017/03/22 11:12:46.722  VERBOSE:  tcprel0x103144      pjsua_0
........Request to shutdown in state Ready, cause:0
2017/03/22 11:12:46.722  VERBOSE:  tcprel0x103144      pjsua_0
........State changed Ready --> Deallocating
2017/03/22 11:12:46.722  VERBOSE:         icetp00      pjsua_0
........TURN client state changed Ready --> Deallocating
2017/03/22 11:12:46.724  VERBOSE:  tcprel0x103144      pjsua_0
.........TX 164 bytes STUN message to [a public ip]:443:
--- begin STUN message ---
STUN Refresh request
 Hdr: length=144, magic=2112a442, tsx_id=000010f13b4ff98d17ce3257
 Attributes:
  LIFETIME: length=4, value=0 (0x0)
  SOFTWARE: length=16, value="pjnath-2.6.0"
  USERNAME: length=47, value="1490257935:754CFE7C-EE5C-489B-80BE-887D48CC4793"
  REALM: length=13, value="example.org"
  NONCE: length=16, value="7af8afde8dd9343a"
  MESSAGE-INTEGRITY: length=20, data=c2fdea6a544dcd8ddd52523b602845364f2ac42a
--- end of STUN message ---

2017/03/22 11:12:46.724  VERBOSE:  utsx0x1038b328      pjsua_0
.........STUN client transaction created
2017/03/22 11:12:46.724  VERBOSE:  utsx0x1038b328      pjsua_0
.........STUN sending message (transmit count=8)
2017/03/22 11:12:46.724  VERBOSE:  stuntp0x1703f2      pjsua_0
........STUN sock 0x10307c828 request, ref_cnt=17
2017/03/22 11:12:46.724  VERBOSE:  stuntp0x1703f2      pjsua_0
........STUN session 0x1033e6628 destroy request, ref_cnt=17
2017/03/22 11:12:46.724  VERBOSE:  tcprel0x103986      pjsua_0
........Request to shutdown in state Ready, cause:0
2017/03/22 11:12:46.725  VERBOSE:  tcprel0x103986      pjsua_0
........State changed Ready --> Deallocating
2017/03/22 11:12:46.725  VERBOSE:         icetp00      pjsua_0
........TURN client state changed Ready --> Deallocating
2017/03/22 11:12:46.725  VERBOSE:  tcprel0x103986      pjsua_0
.........TX 164 bytes STUN message to [a public ip]:443:
--- begin STUN message ---
STUN Refresh request
 Hdr: length=144, magic=2112a442, tsx_id=000010f17f88b66617ce3258
 Attributes:
  LIFETIME: length=4, value=0 (0x0)
  SOFTWARE: length=16, value="pjnath-2.6.0"
  USERNAME: length=47, value="1490257935:754CFE7C-EE5C-489B-80BE-887D48CC4793"
  REALM: length=13, value="example.org"
  NONCE: length=16, value="e28b9fa5b9180580"
  MESSAGE-INTEGRITY: length=20, data=f78d9ef57c72fdeb1726ab3ea278e8351881fa09
--- end of STUN message ---

2017/03/22 11:12:46.725  VERBOSE:  utsx0x1039a8a8      pjsua_0
.........STUN client transaction created
2017/03/22 11:12:46.725  VERBOSE:  utsx0x1039a8a8      pjsua_0
.........STUN sending message (transmit count=8)
2017/03/22 11:12:46.725  VERBOSE:  tdta0x10392480      pjsua_0
.....Destroying txdata Request msg ACK/cseq=13212 (tdta0x103924800)
2017/03/22 11:12:46.725  VERBOSE:  tdta0x103a2be0      pjsua_0
.....Destroying txdata Request msg INVITE/cseq=13212 (tdta0x103a2be00)
2017/03/22 11:12:46.725  VERBOSE:  dlg0x1033e18a8      pjsua_0
......Session count dec to 3 by mod-invite
2017/03/22 11:12:46.725  VERBOSE:  tsx0x10398dea8      pjsua_0
Timeout timer event
2017/03/22 11:12:46.725  VERBOSE:  tsx0x10398dea8      pjsua_0
.State changed from Completed to Terminated, event=TIMER
2017/03/22 11:12:46.725  VERBOSE:  dlg0x1033e18a8      pjsua_0
..Transaction tsx0x10398dea8 state changed to Terminated
2017/03/22 11:12:46.725  VERBOSE:  dlg0x1033e18a8      pjsua_0
...Dialog destroyed
2017/03/22 11:12:46.725  VERBOSE:  tsx0x10398dea8      pjsua_0
Timeout timer event
2017/03/22 11:12:46.725  VERBOSE:  tsx0x10398dea8      pjsua_0
.State changed from Terminated to Destroyed, event=TIMER
2017/03/22 11:12:46.725  VERBOSE:  tdta0x10392aa0      pjsua_0
..Destroying txdata Response msg 200/BYE/cseq=595 (tdta0x10392aa00)
2017/03/22 11:12:46.726  VERBOSE:  tsx0x10398dea8      pjsua_0
Transaction destroyed!
2017/03/22 11:12:46.875  VERBOSE:  tcprel0x103144      pjsua_0
.RX 204 bytes STUN message from [a public ip]:443:
--- begin STUN message ---
STUN Refresh success response
 Hdr: length=64, magic=2112a442, tsx_id=000010f13b4ff98d17ce3257
 Attributes:
  LIFETIME: length=4, value=0 (0x0)
  SOFTWARE: length=26, value="Coturn-4.5.0.3 'dan Eider'"
  MESSAGE-INTEGRITY: length=20, data=adaa149dc41092e0d932f01d40248bc2f5f17399
--- end of STUN message ---

2017/03/22 11:12:46.875  VERBOSE:  tcprel0x103144      pjsua_0
.State changed Deallocating --> Deallocated
2017/03/22 11:12:46.875  VERBOSE:  tcprel0x103144      pjsua_0
.Request to shutdown in state Deallocated, cause:0
2017/03/22 11:12:46.875  VERBOSE:  tcprel0x103144      pjsua_0
.State changed Deallocated --> Destroying
2017/03/22 11:12:46.875  VERBOSE:  stun_session.c      pjsua_0
.tdata 0x1038b30a8 destroy request, force=0, tsx=0x1038b3288
2017/03/22 11:12:46.875  VERBOSE:  utsx0x1038b328      pjsua_0
.STUN transaction 0x1038b3288 schedule destroy

Thanks!

_______________________________________________
Visit our blog: http://blog.pjsip.org

pjsip mailing list
pjsip@xxxxxxxxxxxxxxx
http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org



[Index of Archives]     [Asterisk Users]     [Asterisk App Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [Linux API]
  Powered by Linux