Re: Segfault in timer.c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ming,

I am using Asterisk bundled, which is PJSIP 2.8 with patches:

"Fixed #2191:   
- Stricter double timer entry scheduling prevention.  
- Integrate group lock in SIP transport, e.g: for add/dec ref,     for timer scheduling."

I am have just had a further crash in a different location, which looks like its crashed due to a timer entry being cancelled and an error is happening while moving nodes. Is more locking required when performing operations on the timer heap?

Thread 1 (Thread 0x7f783ebfa700 (LWP 83829)):
#0  0x00007f7ae3dc0249 in copy_node (ht=0x184cbf0, slot=2305, moved_node=0x7f797732f0c0) at ../src/pj/timer.c:137
#1  0x00007f7ae3dc0677 in reheap_up (ht=0x184cbf0, moved_node=0x7f780e0f2990, slot=2305, parent=1152) at ../src/pj/timer.c:208
#2  0x00007f7ae3dc0882 in remove_node (ht=0x184cbf0, slot=2305) at ../src/pj/timer.c:254
        parent = 1152
        moved_node = 0x7f780e0f2990
        removed_node = 0x7f797f4a9e08
#3  0x00007f7ae3dc0b64 in cancel (ht=0x184cbf0, entry=0x7f797f4a9e08, flags=1) at ../src/pj/timer.c:353
        timer_node_slot = 2305
#4  0x00007f7ae3dc1061 in cancel_timer (ht=0x184cbf0, entry=0x7f797f4a9e08, flags=0, id_val=0) at ../src/pj/timer.c:591
        count = 32634
#5  0x00007f7ae3dc10ea in pj_timer_heap_cancel (ht=0x184cbf0, entry=0x7f797f4a9e08) at ../src/pj/timer.c:611
#6  0x00007f7ae3d0f0c8 in pjsip_endpt_cancel_timer (endpt=0x184c908, entry=0x7f797f4a9e08) at ../src/pjsip/sip_endpoint.c:848
#7  0x00007f7ae3cf62f6 in set_timer (sub=0x7f797f4a9cc8, timer_id=2, seconds=3600) at ../src/pjsip-simple/evsub.c:508
#8  0x00007f7ae3cf91cc in on_tsx_state_uas (sub=0x7f797f4a9cc8, tsx=0x7f780e0f27d8, event=0x7f783ebf98c0) at ../src/pjsip-simple/evsub.c:2172
        expires = 0x7f7a10f9e600
        tdata = 0x7f799cbcc068
        st_code = 200
        st_text = 0x0
        body = 0x0
        old_state = PJSIP_EVSUB_STATE_ACTIVE
        old_state_str = {ptr = 0x7f7ae3dc6f0c "ACTIVE", slen = 6}
        reason = {ptr = 0x0, slen = 0}
        rdata = 0x7f7a10f95a68
        msg = 0x7f7a10f9dac0
        res_hdr = {prev = 0x7f783ebf9700, next = 0x7f783ebf9700, type = 284812696, name = {ptr = 0x7f797e162ca8 "", slen = 284809960}, sname = {ptr = 0x7f797f4a9cc8 "evsub0x7f797f4a9cc8", slen = 140154425546736}, vptr = 0x7f7ae3cf7f2f <on_new_transaction+724>}
        status = 0
        event_hdr = 0x7f7a10f9e548

Kind regards,

Ross


From: pjsip <pjsip-bounces@xxxxxxxxxxxxxxx> on behalf of Ming <ming@xxxxxxxxx>
Sent: 30 May 2019 07:20
To: pjsip list
Subject: Re: Segfault in timer.c
 
Hi Ross,

- Is there any steps or specific scenario to reproduce the issue?
- Which PJSIP version are you using? You mentioned ticket #2176, which
additional patches do you apply to the base version you're using?
- Our latest fix with regard to the timer is ticket #2191
(https://trac.pjsip.org/repos/ticket/2191). Please let us know if the
problem still persists after this patch. Another ticket which may be
relevant is ticket #2172 (https://trac.pjsip.org/repos/ticket/2172).
- For the stack trace which you provided, the crash seems to be caused
when accessing the node 0x7fa9c4c04c50. So, you  will need to check
the PJSIP log, with PJ_TIMER_DEBUG on, and see to which component the
timer entry belongs, and when&how the pool which allocates the entry
gets destroyed, and why isn't the timer cancelled first before the
deallocation.

Regards,
Ming


On Tue, Apr 9, 2019 at 9:04 PM Ross Beer <ross.beer@xxxxxxxxxxx> wrote:
>
> Hello,
>
> The previous segfault was with issue #2176 (https://trac.pjsip.org/repos/changeset/5934) patch applied, this patch appears to increase the frequency of segfaults in the 'pj_timer_heap_poll' process instead of mitigating them.
>
> Can anyone offer assistance in getting this resolved?
>
> Regards,
>
> Ross
>
> ________________________________
> From: pjsip <pjsip-bounces@xxxxxxxxxxxxxxx> on behalf of Ross Beer <ross.beer@xxxxxxxxxxx>
> Sent: 08 April 2019 17:05
> To: pjsip@xxxxxxxxxxxxxxx
> Subject: Segfault in timer.c
>
> Hi,
>
> We are seeing multiple segfaults while copying nodes:
>
> Thread 1 (Thread 0x7fa977fff700 (LWP 36699)):
> #0  0x00007fac956676fe in copy_node (ht=0x16d0410, slot=430, moved_node=0x7fa9c4c04c50) at ../src/pj/timer.c:137
> #1  0x00007fac956679d9 in reheap_down (ht=0x16d0410, moved_node=0x7fab60031970, slot=430, child=862) at ../src/pj/timer.c:185
> #2  0x00007fac95667d1d in remove_node (ht=0x16d0410, slot=0) at ../src/pj/timer.c:252
>         parent = 0
>         moved_node = 0x7fab60031970
>         removed_node = 0x7fa9b5978ae0
> #3  0x00007fac95668634 in pj_timer_heap_poll (ht=0x16d0410, next_delay=0x7fa977ffecd0) at ../src/pj/timer.c:634
>         node = 0x100000002
>         grp_lock = 0x31dc8c88
>         now = {sec = 8157205, msec = 523}
>         count = 0
> #4  0x00007fac955b6b06 in pjsip_endpt_handle_events2 (endpt=0x16d0128, max_timeout=0x7fa977ffed40, p_count=0x0) at ../src/pjsip/sip_endpoint.c:715
>         timeout = {sec = 0, msec = 0}
>         count = 0
>         net_event_count = 0
>         c = 0
> #5  0x00007fac955b6c4b in pjsip_endpt_handle_events (endpt=0x16d0128, max_timeout=0x7fa977ffed40) at ../src/pjsip/sip_endpoint.c:776
> #6  0x00007faac8024bcf in monitor_thread_exec (endpt=0x0) at res_pjsip.c:4512
>         delay = {sec = 0, msec = 10}
> #7  0x00007fac9564fcb0 in thread_main (param=0x1815b28) at ../src/pj/os_core_unix.c:541
>         rec = 0x1815b28
>         result = 0x0
>         rc = 0
> #8  0x00007fac9342ddd5 in start_thread () at /usr/lib64/libpthread.so.0
> #9  0x00007fac927cfead in clone () at /usr/lib64/libc.so.6
>
>
> Can anyone assist in resolving the issue?
>
> Regards,
>
> Ross
> _______________________________________________
> Visit our blog: http://blog.pjsip.org
>
> pjsip mailing list
> pjsip@xxxxxxxxxxxxxxx
> http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org

_______________________________________________
Visit our blog: http://blog.pjsip.org

pjsip mailing list
pjsip@xxxxxxxxxxxxxxx
http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
_______________________________________________
Visit our blog: http://blog.pjsip.org

pjsip mailing list
pjsip@xxxxxxxxxxxxxxx
http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org

[Index of Archives]     [Asterisk Users]     [Asterisk App Development]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [Linux API]
  Powered by Linux