On 18/08/2018 16:20, Willem Jan Withagen wrote:
On 18/08/2018 14:46, Willem Jan Withagen wrote:
Hi,
I have upgraded to FreeBSD 12.0-ALPHA, but I don't think the errors
come from there, although they could be in one of the libs that came
along with the upgrade.
I'm getting these warnings during rbd and ceph (maybe even more)
invocations, and they indicate a possible problem because:
===
It could be possible that a dynamically loaded library, use
thread_local variable but is dlclose()'d before thread exit. The
destructor of this variable will then try to access the address,
for calling it but it's unloaded, so it'll crash. We're using
__elf_phdr_match_addr() to detect and prevent such cases and so
prevent the crash.
===
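To make the mechanism in that comment a bit more concrete, here is a minimal
single-file sketch (no dlopen() involved) showing that a thread_local object's
destructor is only run when the thread exits. The names Tracked, worker,
log_event and run_thread_local_demo are made up for illustration. As far as I
know, GCC and Clang typically register such destructors through
__cxa_thread_atexit; that part is an assumption about the toolchain, not
something the code below depends on.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical event log, only here to make the ordering observable.
static std::mutex g_log_mutex;
static std::vector<std::string> g_log;

static void log_event(const std::string& what) {
    std::lock_guard<std::mutex> guard(g_log_mutex);
    g_log.push_back(what);
}

// A type with a non-trivial destructor, standing in for something like
// std::random_device living in a shared library.
struct Tracked {
    Tracked() { log_event("ctor"); }
    ~Tracked() { log_event("dtor"); }
    void touch() {}
};

static void worker() {
    // Constructed on first use in this thread. The destructor is queued
    // (assumption: via __cxa_thread_atexit) and runs at thread exit.
    thread_local Tracked t;
    t.touch();
    log_event("worker body done");
}

// Runs one thread to completion and returns the recorded event order.
std::vector<std::string> run_thread_local_demo() {
    {
        std::lock_guard<std::mutex> guard(g_log_mutex);
        g_log.clear();
    }
    std::thread th(worker);
    th.join();
    std::lock_guard<std::mutex> guard(g_log_mutex);
    return g_log;
}
```

The destructor call is just a stored function pointer that gets invoked after
the thread body has finished. If Tracked lived in a library that was
dlclose()'d in between, that pointer would refer to unmapped code, which is
the case the libc comment is about.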
This is from:
https://github.com/freebsd/freebsd/blob/master/lib/libc/stdlib/cxa_thread_atexit_impl.c
Now it could be that dlclose() and thread exit just happen close to one
another. Still, this check has been firmly embedded in libc since
2017, so I'm sort of expecting that a recent change has caused this.
And as indicated, it is a possible cause of crashes, because
thread exit is going to clean up things that are no longer there.
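For illustration, a rough sketch of the kind of address check the libc
comment describes. This is not the FreeBSD implementation and does not use
__elf_phdr_match_addr(); it is an approximation built on dl_iterate_phdr()
from <link.h>, and the names Query, match_segment,
address_is_in_loaded_object and sample_dtor are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <link.h>  // dl_iterate_phdr(), struct dl_phdr_info

namespace {

struct Query {
    std::uintptr_t addr;  // address we are looking for
    bool found;           // set when a PT_LOAD segment contains it
};

// Callback invoked once per currently loaded object.
int match_segment(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* q = static_cast<Query*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const auto& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD)
            continue;
        const std::uintptr_t start = info->dlpi_addr + ph.p_vaddr;
        if (q->addr >= start && q->addr < start + ph.p_memsz) {
            q->found = true;
            return 1;  // non-zero stops the iteration
        }
    }
    return 0;
}

}  // namespace

// True if p lies inside a loadable segment of some object that is mapped
// right now. A destructor pointer failing this test would be skipped.
bool address_is_in_loaded_object(const void* p) {
    Query q{reinterpret_cast<std::uintptr_t>(p), false};
    dl_iterate_phdr(match_segment, &q);
    return q.found;
}

// Stand-in for a registered thread destructor.
void sample_dtor(void*) {}
```

Note that a check like this only tells you the address is mapped at the
moment of the test. Nothing stops another thread from unloading the library
between the check and the call.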
Now the 20 dollar question is:
Where was this introduced??
Otherwise I'll have to try and throw my best gdb capabilities at it,
and try to invoke an rbd call and see where it activates this warning.
With a bit of debugging-fu it was rather simple to find the dtor with the problem:
__cxa_thread_call_dtors: dtr 0x80c9e1bc0 from unloaded dso, skipping
cxa_thread_walk (cb=<optimized out>) at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
129 free(dtor);
(gdb) info symbol 0x80c9e1bc0
std::__1::random_device::~random_device() in section .text of /usr/lib/libc++.so.1
And this is during process exit:
#0 cxa_thread_walk (cb=<optimized out>) at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
#1 __cxa_thread_call_dtors () at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:144
#2 0x000000080cbdfb9a in exit (status=45) at /usr/srcs/head/src/lib/libc/stdlib/exit.c:73
#3 0x000000000060a09c in _start (ap=<optimized out>, cleanup=<optimized out>) at /usr/srcs/head/src/lib/csu/amd64/crt1.c:74
So I guess it could be just about anywhere that random() is used?
BTW: I have the same issue on the Jenkins build for mimic.
More about this issue: it seems there is a substantial difference
between Linux and FreeBSD in how they manage opened dynamic libraries:
On 26/08/2018 12:19, David Chisnall wrote:
The FreeBSD implementation here looks racy. If one thread dlcloses an
object while another thread is exiting, we can end up calling a function
at an invalid memory address. It also looks as if it may be possible to
unload one library, load another at the same address, and end up
executing entirely the wrong code, which would have some serious
security implications.
The GNU/Linux equivalent of this function locks the DSO in memory until
all references to it have gone away. A call to dlclose() on GNU/Linux
will not actually unload the library until all threads with destructors
in that library have been unloaded. I believe that this reuses the same
reference counting mechanism that allows the same library to be dlopened
and dlclosed multiple times.
It would be nice if the FreeBSD version had the same behaviour, because
this is almost certainly expected in code written on other platforms.
===========
So would this be a correct assumption, and is what I see caused by
the ceph project actually relying on this feature of the Linux DL implementation?
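To spell out the difference, here is a toy model of the Linux behaviour as
David describes it: the library stays mapped until both the dlopen()
references and the pending thread destructors are gone. This is not rtld or
glibc code. ToyDso and the toy_* functions are made up, and folding
everything into a single counter is a simplification based only on the
description above.

```cpp
#include <cassert>

// Toy stand-in for a loaded shared object.
struct ToyDso {
    int refcount = 0;     // dlopen() handles plus pending thread destructors
    bool mapped = false;  // whether the code is still in memory
};

// Drop one reference; unmap only when nobody needs the code any more.
void toy_release(ToyDso& dso) {
    assert(dso.refcount > 0);
    if (--dso.refcount == 0)
        dso.mapped = false;
}

// Opening the library maps it on the first reference.
void toy_dlopen(ToyDso& dso) {
    if (dso.refcount++ == 0)
        dso.mapped = true;
}

// Closing only drops the caller's reference.
void toy_dlclose(ToyDso& dso) {
    toy_release(dso);
}

// A thread_local object from this library registers its destructor,
// which pins the library until that destructor has run.
void toy_register_thread_dtor(ToyDso& dso) {
    assert(dso.mapped);
    ++dso.refcount;
}

// At thread exit the destructor is still callable, then the pin is dropped.
void toy_thread_exit_runs_dtor(ToyDso& dso) {
    assert(dso.mapped);  // the destructor's code is guaranteed to be there
    toy_release(dso);
}
```

In this model a dlclose() that comes before thread exit leaves the library
mapped, so the destructor can still be called safely. Code written with that
expectation would hit the "unloaded dso, skipping" path on FreeBSD instead.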
--WjW