It looks like the issue you are experiencing was fixed in the Infernalis/master branches [1]. I've opened a new tracker ticket to backport the fix to Hammer [2].

--

Jason Dillaman

[1] https://github.com/sponce/ceph/commit/e4c27d804834b4a8bc495095ccf5103f8ffbcc1e
[2] http://tracker.ceph.com/issues/13210

----- Original Message -----
> From: "Paul Mansfield" <paul.mansfield@xxxxxxxxxxxxxxxxxx>
> To: "Jason Dillaman" <dillaman@xxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Sent: Wednesday, September 23, 2015 6:25:36 AM
> Subject: Re: lttng duplicate registration problem when using librados2 and libradosstriper
>
> On 22/09/15 19:48, Jason Dillaman wrote:
> > It's not the best answer, but it is the reason why it is currently
> > disabled on RHEL 7. Best bet for finding a long-term solution is
> > still probably attaching with gdb and catching the abort function
> > call. Once the offending probe can be found, we can figure out how to
> > fix it.
>
> I tried gdb and strace. I didn't find anything that gave me any insight.
>
> Here's running it with gdb. I've not used gdb in anger in years, so
> quite possibly I'm doing it wrongly
>
> $ gdb ./testprogram
> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /foo/bar/testprogram...done.
> (gdb) handle SIGABRT stop nopass
> Signal        Stop      Print   Pass to program Description
> SIGABRT       Yes       Yes     No              Aborted
> (gdb) start
> Temporary breakpoint 1 at 0x4017ac: file testprogram, line 184.
> Starting program: /foo/bar/testprogram
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7fffed9da700 (LWP 53014)]
> [New Thread 0x7fffed1d9700 (LWP 53015)]
> LTTng-UST: Error (-17) while registering tracepoint probe. Duplicate
> registration of tracepoint probes having the same name is not allowed.
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff24b8925 in raise () from /lib64/libc.so.6
> Missing separate debuginfos, use: debuginfo-install
> CUnit-2.1.2-6.el6.x86_64 boost-system-1.41.0-18.el6.x86_64
> boost-thread-1.41.0-18.el6.x86_64 cassandra-cpp-driver-2.0.1-1.el6.amd64
> glibc-2.12-1.132.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64
> krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6.x86_64
> libgcc-4.4.7-4.el6.x86_64 librados2-0.94.3-0.el6.x86_64
> libradosstriper1-0.94.3-0.el6.x86_64
> libselinux-2.0.94-5.3.el6_4.1.x86_64 libstdc++-4.4.7-4.el6.x86_64
> libuuid-2.17.2-12.14.el6.x86_64 libuv-1.2.1-1.el6.x86_64
> lttng-ust-2.4.1-1.el6.x86_64 nspr-4.10.2-1.el6_5.x86_64
> nss-3.15.3-6.el6_5.x86_64 nss-util-3.15.3-1.el6_5.x86_64
> openssl-1.0.1e-16.el6_5.7.x86_64 userspace-rcu-0.7.7-1.el6.x86_64
> zlib-1.2.3-29.el6.x86_64
> (gdb) backtrace
> #0 0x00007ffff24b8925 in raise () from /lib64/libc.so.6
> #1 0x00007ffff24ba105 in abort () from /lib64/libc.so.6
> #2 0x00007ffff58c58f4 in ?? () from /usr/lib64/librados.so.2
> #3 0x00007ffff58f4936 in ?? () from /usr/lib64/librados.so.2
> #4 0x00007fffffffe9a8 in ?? ()
> #5 0x0000000000000001 in ?? ()
> #6 0x00007fffffffe9a8 in ?? ()
> #7 0x00007ffff555f51b in _init () from /usr/lib64/librados.so.2
> #8 0x00007ffff7fea000 in ?? ()
> #9 0x00007ffff7deb555 in _dl_init_internal () from
> /lib64/ld-linux-x86-64.so.2
> #10 0x00007ffff7dddb3a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> #11 0x0000000000000001 in ?? ()
> #12 0x00007fffffffec44 in ?? ()
> #13 0x0000000000000000 in ?? ()
>
>
> This didn't tell me much.
> I tried using "nm" on the librados and
> libradosstriper libraries and there was no symbol information.
>
>
> I also tried strace which revealed two sub processes
>
> $ grep "/dev/shm/lttng-ust" strace.out
> [pid 49682] open("/dev/shm/lttng-ust-wait-5",
> O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
> [pid 49683] open("/dev/shm/lttng-ust-wait-5-2489",
> O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
>
>
> $ grep "pid 49682" strace.out | more
> [pid 49682] set_robust_list(0x7fe69cb5b9e0, 0x18 <unfinished ...>
> [pid 49682] <... set_robust_list resumed> ) = 0
> [pid 49682] socket(PF_FILE, SOCK_STREAM, 0 <unfinished ...>
> [pid 49682] <... socket resumed> ) = 3
> [pid 49682] fcntl(3, F_SETFD, FD_CLOEXECProcess 49683 attached
> [pid 49682] connect(3, {sa_family=AF_FILE,
> path="/var/run/lttng/lttng-ust-sock-5"}, 110 <unfinished ...>
> [pid 49682] <... connect resumed> ) = -1 ENOENT (No such file or
> directory)
> [pid 49682] close(3 <unfinished ...>
> [pid 49682] <... close resumed> ) = 0
> [pid 49682] statfs("/dev/shm/", <unfinished ...>
> [pid 49682] <... statfs resumed> {f_type=0x1021994, f_bsize=4096,
> f_blocks=8242437, f_bfree=8242435, f_bavail=8242435, f_files=8242437,
> f_ffree=8242434, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
> [pid 49682] futex(0x7fe69fd6b300, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> [pid 49682] open("/dev/shm/lttng-ust-wait-5",
> O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
> [pid 49682] fcntl(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> [pid 49682] read(3, "\0\0\0\0", 4) = 4
> [pid 49682] mmap(NULL, 4096, PROT_READ, MAP_SHARED, 3, 0) = 0x7fe6a717b000
> [pid 49682] close(3) = 0
> [pid 49682] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 1
> [pid 49682] futex(0x7fe6a717b000, FUTEX_WAIT, 0, NULL <unfinished ...>
> [pid 49682] +++ killed by SIGABRT (core dumped) +++
>
>
> $ grep "pid 49683" strace.out | more
> [pid 49683] set_robust_list(0x7fe69c35a9e0, 0x18 <unfinished ...>
> [pid 49683] <... set_robust_list resumed> ) = 0
> [pid 49683] socket(PF_FILE, SOCK_STREAM, 0 <unfinished ...>
> [pid 49683] <... socket resumed> ) = 4
> [pid 49683] fcntl(4, F_SETFD, FD_CLOEXEC) = 0
> [pid 49683] connect(4, {sa_family=AF_FILE,
> path="/p4/.lttng/lttng-ust-sock-5"}, 110 <unfinished ...>
> [pid 49683] <... connect resumed> ) = -1 ENOENT (No such file or
> directory)
> [pid 49683] close(4 <unfinished ...>
> [pid 49683] <... close resumed> ) = 0
> [pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAIT_PRIVATE, 2, NULL
> <unfinished ...>
> [pid 49683] <... futex resumed> ) = 0
> [pid 49683] futex(0x7fe6a13f1620, FUTEX_WAKE_PRIVATE, 1) = 1
> [pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid 49683] open("/dev/shm/lttng-ust-wait-5-2489",
> O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
> [pid 49683] read(3, "\0\0\0\0", 4) = 4
> [pid 49683] fstat(3, {st_mode=S_IFREG|0640, st_size=4096, ...}) = 0
> [pid 49683] getuid( <unfinished ...>
> [pid 49683] <... getuid resumed> ) = 2489
> [pid 49683] mmap(NULL, 4096, PROT_READ, MAP_SHARED, 3, 0) = 0x7fe6a717a000
> [pid 49683] close(3) = 0
> [pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 1
> [pid 49683] futex(0x7fe6a717a000, FUTEX_WAIT, 0, NULL <unfinished ...>
> [pid 49683] +++ killed by SIGABRT (core dumped) +++
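
For anyone repeating the per-pid grep from the quoted strace output, the same split can be sketched in Python. This is only an illustration, assuming each line of strace.out carries a "[pid N]" prefix as in the excerpts quoted in this thread; the helper names group_by_pid and aborted_pids are made up for this sketch and are not part of Ceph, LTTng, or strace.

```python
import re
from collections import defaultdict

# Matches the "[pid 49682] ..." prefix seen on the quoted strace lines.
PID_LINE = re.compile(r"^\[pid\s+(\d+)\]\s?(.*)$")


def group_by_pid(lines):
    """Return {pid: [event, ...]} for every line that has a [pid N] prefix.

    Lines without the prefix (wrapped continuations, other noise) are skipped.
    """
    events = defaultdict(list)
    for line in lines:
        match = PID_LINE.match(line.rstrip("\n"))
        if match:
            events[int(match.group(1))].append(match.group(2))
    return dict(events)


def aborted_pids(events):
    """Return the sorted pids whose last recorded event is a SIGABRT kill."""
    return sorted(
        pid
        for pid, evs in events.items()
        if evs and "killed by SIGABRT" in evs[-1]
    )


# Sample lines copied from the strace excerpts quoted in this thread.
sample = [
    "[pid 49682] futex(0x7fe6a717b000, FUTEX_WAIT, 0, NULL <unfinished ...>",
    "[pid 49682] +++ killed by SIGABRT (core dumped) +++",
    "[pid 49683] futex(0x7fe6a717a000, FUTEX_WAIT, 0, NULL <unfinished ...>",
    "[pid 49683] +++ killed by SIGABRT (core dumped) +++",
]

print(aborted_pids(group_by_pid(sample)))
```

To run it against a real capture, the sample list could be replaced by the lines read from strace.out. It only sorts the output by process; it does not identify the offending tracepoint probe.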