Re: lttng duplicate registration problem when using librados2 and libradosstriper

On 22/09/15 19:48, Jason Dillaman wrote:
> It's not the best answer, but it is the reason why it is currently
> disabled on RHEL 7.  Best bet for finding a long-term solution is
> still probably attaching with gdb and catching the abort function
> call.  Once the offending probe can be found, we can figure out how to
> fix it.

I tried gdb and strace. I didn't find anything that gave me any insight.

Here is the gdb session. I haven't used gdb in anger in years, so
quite possibly I'm doing it wrong.

$ gdb ./testprogram
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /foo/bar/testprogram...done.
(gdb) handle SIGABRT stop nopass
Signal        Stop      Print   Pass to program Description
SIGABRT       Yes       Yes     No              Aborted
(gdb) start
Temporary breakpoint 1 at 0x4017ac: file testprogram, line 184.
Starting program: /foo/bar/testprogram
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffed9da700 (LWP 53014)]
[New Thread 0x7fffed1d9700 (LWP 53015)]
LTTng-UST: Error (-17) while registering tracepoint probe. Duplicate
registration of tracepoint probes having the same name is not allowed.

Program received signal SIGABRT, Aborted.
0x00007ffff24b8925 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
CUnit-2.1.2-6.el6.x86_64 boost-system-1.41.0-18.el6.x86_64
boost-thread-1.41.0-18.el6.x86_64 cassandra-cpp-driver-2.0.1-1.el6.amd64
glibc-2.12-1.132.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64
krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6.x86_64
libgcc-4.4.7-4.el6.x86_64 librados2-0.94.3-0.el6.x86_64
libradosstriper1-0.94.3-0.el6.x86_64
libselinux-2.0.94-5.3.el6_4.1.x86_64 libstdc++-4.4.7-4.el6.x86_64
libuuid-2.17.2-12.14.el6.x86_64 libuv-1.2.1-1.el6.x86_64
lttng-ust-2.4.1-1.el6.x86_64 nspr-4.10.2-1.el6_5.x86_64
nss-3.15.3-6.el6_5.x86_64 nss-util-3.15.3-1.el6_5.x86_64
openssl-1.0.1e-16.el6_5.7.x86_64 userspace-rcu-0.7.7-1.el6.x86_64
zlib-1.2.3-29.el6.x86_64
(gdb) backtrace
#0  0x00007ffff24b8925 in raise () from /lib64/libc.so.6
#1  0x00007ffff24ba105 in abort () from /lib64/libc.so.6
#2  0x00007ffff58c58f4 in ?? () from /usr/lib64/librados.so.2
#3  0x00007ffff58f4936 in ?? () from /usr/lib64/librados.so.2
#4  0x00007fffffffe9a8 in ?? ()
#5  0x0000000000000001 in ?? ()
#6  0x00007fffffffe9a8 in ?? ()
#7  0x00007ffff555f51b in _init () from /usr/lib64/librados.so.2
#8  0x00007ffff7fea000 in ?? ()
#9  0x00007ffff7deb555 in _dl_init_internal () from
/lib64/ld-linux-x86-64.so.2
#10 0x00007ffff7dddb3a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#11 0x0000000000000001 in ?? ()
#12 0x00007fffffffec44 in ?? ()
#13 0x0000000000000000 in ?? ()



This didn't tell me much. I also ran "nm" on the librados and
libradosstriper libraries, but it reported no symbol information.
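
One thing that may still be worth trying: a stripped shared library normally
keeps its dynamic symbol table, which "nm -D" reads even when plain "nm"
finds nothing. The sketch below lists the defined dynamic symbols that two
libraries have in common. It assumes nm from binutils is on the PATH and
that the probe symbols contain the substring "tracepoint"; that filter is a
guess, not something confirmed in this thread. Both library paths must be
supplied on the command line.

```python
import subprocess
import sys


def parse_nm_output(text):
    """Return the set of symbol names from `nm -D --defined-only` output."""
    names = set()
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols are printed as "<address> <type> <name>".
        if len(parts) == 3:
            names.add(parts[2])
    return names


def defined_dynamic_symbols(path):
    """Run nm on a shared library and return its defined dynamic symbols."""
    out = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_nm_output(out)


def shared_symbols(a, b, needle="tracepoint"):
    """Sorted symbols present in both sets whose name contains `needle`.

    The default needle is an assumption about how the probes are named.
    """
    return sorted(s for s in a & b if needle in s)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python shared_syms.py <library A> <library B>
    lib_a, lib_b = sys.argv[1], sys.argv[2]
    for name in shared_symbols(defined_dynamic_symbols(lib_a),
                               defined_dynamic_symbols(lib_b)):
        print(name)
```

Any name printed is defined in both libraries, which makes it a candidate
for the duplicate registration, though it does not prove it.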


I also tried strace, which revealed two subprocesses:

$ grep "/dev/shm/lttng-ust" strace.out
[pid 49682] open("/dev/shm/lttng-ust-wait-5",
O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
[pid 49683] open("/dev/shm/lttng-ust-wait-5-2489",
O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3



$ grep "pid 49682" strace.out | more
[pid 49682] set_robust_list(0x7fe69cb5b9e0, 0x18 <unfinished ...>
[pid 49682] <... set_robust_list resumed> ) = 0
[pid 49682] socket(PF_FILE, SOCK_STREAM, 0 <unfinished ...>
[pid 49682] <... socket resumed> )      = 3
[pid 49682] fcntl(3, F_SETFD, FD_CLOEXECProcess 49683 attached
[pid 49682] connect(3, {sa_family=AF_FILE,
path="/var/run/lttng/lttng-ust-sock-5"}, 110 <unfinished ...>
[pid 49682] <... connect resumed> )     = -1 ENOENT (No such file or
directory)
[pid 49682] close(3 <unfinished ...>
[pid 49682] <... close resumed> )       = 0
[pid 49682] statfs("/dev/shm/",  <unfinished ...>
[pid 49682] <... statfs resumed> {f_type=0x1021994, f_bsize=4096,
f_blocks=8242437, f_bfree=8242435, f_bavail=8242435, f_files=8242437,
f_ffree=8242434, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 49682] futex(0x7fe69fd6b300, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid 49682] open("/dev/shm/lttng-ust-wait-5",
O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
[pid 49682] fcntl(3, F_GETFD)           = 0x1 (flags FD_CLOEXEC)
[pid 49682] read(3, "\0\0\0\0", 4)      = 4
[pid 49682] mmap(NULL, 4096, PROT_READ, MAP_SHARED, 3, 0) = 0x7fe6a717b000
[pid 49682] close(3)                    = 0
[pid 49682] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 49682] futex(0x7fe6a717b000, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 49682] +++ killed by SIGABRT (core dumped) +++


$ grep "pid 49683" strace.out | more
[pid 49683] set_robust_list(0x7fe69c35a9e0, 0x18 <unfinished ...>
[pid 49683] <... set_robust_list resumed> ) = 0
[pid 49683] socket(PF_FILE, SOCK_STREAM, 0 <unfinished ...>
[pid 49683] <... socket resumed> )      = 4
[pid 49683] fcntl(4, F_SETFD, FD_CLOEXEC) = 0
[pid 49683] connect(4, {sa_family=AF_FILE,
path="/p4/.lttng/lttng-ust-sock-5"}, 110 <unfinished ...>
[pid 49683] <... connect resumed> )     = -1 ENOENT (No such file or
directory)
[pid 49683] close(4 <unfinished ...>
[pid 49683] <... close resumed> )       = 0
[pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAIT_PRIVATE, 2, NULL
<unfinished ...>
[pid 49683] <... futex resumed> )       = 0
[pid 49683] futex(0x7fe6a13f1620, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 49683] open("/dev/shm/lttng-ust-wait-5-2489",
O_RDONLY|O_NOFOLLOW|O_CLOEXEC) = 3
[pid 49683] read(3, "\0\0\0\0", 4)      = 4
[pid 49683] fstat(3, {st_mode=S_IFREG|0640, st_size=4096, ...}) = 0
[pid 49683] getuid( <unfinished ...>
[pid 49683] <... getuid resumed> )      = 2489
[pid 49683] mmap(NULL, 4096, PROT_READ, MAP_SHARED, 3, 0) = 0x7fe6a717a000
[pid 49683] close(3)                    = 0
[pid 49683] futex(0x7fe6a13f15e0, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 49683] futex(0x7fe6a717a000, FUTEX_WAIT, 0, NULL <unfinished ...>
[pid 49683] +++ killed by SIGABRT (core dumped) +++
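
The per-pid grep above can be done in one pass. This is a minimal sketch,
assuming strace.out is the raw "strace -f" output with one record per line
(not re-wrapped as it is in this mail). It groups records by pid and notes
any socket paths and the fatal signal. The summarize() helper and its field
names are made up for illustration.

```python
import re
import sys
from collections import defaultdict

PID_RE = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")
KILLED_RE = re.compile(r"\+\+\+ killed by (\w+)")
PATH_RE = re.compile(r'path="([^"]+)"')


def summarize(lines):
    """Group `strace -f` records by pid; collect socket paths and fatal signals."""
    summary = defaultdict(lambda: {"lines": 0, "paths": [], "killed_by": None})
    for line in lines:
        m = PID_RE.match(line)
        if not m:
            continue  # record without a "[pid N]" prefix; skipped here
        pid, rest = int(m.group(1)), m.group(2)
        entry = summary[pid]
        entry["lines"] += 1
        # Socket addresses appear as path="..." inside connect() arguments.
        entry["paths"].extend(PATH_RE.findall(rest))
        killed = KILLED_RE.search(rest)
        if killed:
            entry["killed_by"] = killed.group(1)
    return dict(summary)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python strace_summary.py strace.out
    with open(sys.argv[1]) as fh:
        for pid, info in sorted(summarize(fh).items()):
            print(pid, info["lines"], info["killed_by"], info["paths"])
```

On the output above this should report both pids as killed by SIGABRT,
each with a different lttng-ust socket path.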


