Re: [PATCH] Make ifaces_get work with dynamic no_rings

On 27/03/12 08:53 +0200, Jan Friesse wrote:
Steven Dake wrote:
On 03/26/2012 10:08 AM, David Teigland wrote:
On Mon, Mar 26, 2012 at 10:36:35AM +0200, Jan Friesse wrote:
The commit that added the number of addresses to the srp_address structure
didn't account for totemsrp_ifaces_get, where the whole structure was copied
instead of only the addresses. This is now fixed.

Also, to make the totempg API forward compatible, the size of the interfaces
array must be passed to ifaces_get-like functions to prevent memory overwrites.
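
For readers following along, here is a minimal sketch of the pattern described
above. It is not the actual patch: the types and the function name
(addr_set, ifaces_copy, MAX_IFACES) are hypothetical stand-ins, not corosync
identifiers. It copies only the address entries rather than the whole
structure, and refuses to write past the array size supplied by the caller.

```c
#include <assert.h>
#include <string.h>

#define MAX_IFACES 2                    /* hypothetical upper bound on rings */

struct addr {                           /* hypothetical stand-in for one interface address */
    char text[16];
};

struct addr_set {                       /* hypothetical stand-in for a structure like srp_address */
    unsigned int no_addrs;              /* number of valid entries in addr[] */
    struct addr addr[MAX_IFACES];
};

/*
 * Copy only the valid addresses from src into the caller's array.
 * out_size is the capacity of out in entries; *count always receives the
 * real number of addresses, so the caller can detect a too-small array.
 * Returns 0 on success, -1 if out cannot hold all addresses (nothing is
 * written to out in that case).
 */
static int ifaces_copy(const struct addr_set *src, struct addr *out,
                       unsigned int out_size, unsigned int *count)
{
    assert(src != NULL && out != NULL && count != NULL);

    *count = src->no_addrs;
    if (src->no_addrs > out_size) {
        return -1;                      /* would overwrite the caller's memory */
    }
    /* Copy the address entries only, not the surrounding structure. */
    memcpy(out, src->addr, sizeof(struct addr) * src->no_addrs);
    return 0;
}
```

Passing the capacity explicitly means a caller built against an older, smaller
array gets an error back instead of a silent overwrite if the number of rings
grows later.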

Thanks, that fixes all the problems I was having on Friday.  It gets me
back to being able to test another problem that I'd been seeing earlier
(which I was told was probably fixed.)  I'm still seeing that previous
problem in dlm_controld using libqb and corosync master branches,

libqb 50f07abcfe7010360f697d1f26b56fec863d892f
corosync e925f421658960d568f168c3623b23414dddcc1c

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fea700 (LWP 7026)]
0x00007ffff7bcd7a0 in sem_trywait () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6.x86_64
(gdb) bt
#0  0x00007ffff7bcd7a0 in sem_trywait () from /lib64/libpthread.so.0
#1  0x00007ffff6bc3805 in my_posix_sem_timedwait (rb=0x555555a7e060, ms_timeout=0)
    at ringbuffer_helper.c:39
#2  0x00007ffff6bc2ee2 in qb_rb_chunk_read (rb=0x555555a7e060, data_out=0x7fffffefce50,
    len=1048576, timeout=<value optimized out>) at ringbuffer.c:548
#3  0x00007ffff6bc612a in qb_ipcc_event_recv (c=0x555555c9b480, msg_pt=0x7fffffefce50,
    msg_len=1048576, ms_timeout=0) at ipcc.c:321
#4  0x00007ffff77b5966 in cpg_dispatch (handle=1003584061500817409,
    dispatch_types=<value optimized out>) at cpg.c:356
#5  0x000055555555da94 in process_cpg_lockspace (ci=6) at cpg.c:1574
#6  0x0000555555565cf4 in loop () at main.c:978
#7  0x0000555555566b36 in main (argc=1, argv=0x7fffffffe5a8) at main.c:1356

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss


Run corosync-fplay.

There is not enough info in this bug report.

Could you go into more detail about what you are doing?  Is it possible you
are calling cpg_finalize() while dispatch is processing?

Regards
-steve

Steve,
I'm also getting similar segfaults from time to time, seemingly at random. All I can say is that the chance of hitting this error is much higher (in my testing environment, so it may vary) for the first IPC connection made very soon after corosync is exec'd. I also believe that older libqb versions (0.8.1 or so) didn't have this problem, but it's possible that, thanks to a pile of other bugs, I simply wasn't able to hit it.

Anyway, this is for sure a bug in libqb, so CC'ing Angus.

Well, if anyone can help with a reproducer, that would help a lot.

I initially thought this was a shutdown issue solved ages ago, but
I guess not.

-Angus
