Re: Warnings about dlclose before thread exit

On 18/08/2018 16:20, Willem Jan Withagen wrote:
On 18/08/2018 14:46, Willem Jan Withagen wrote:
Hi,

I have upgraded to FreeBSD ALPHA 12.0, but I don't think the errors stem from there, although they could be in one of the libs that came along with the upgrade.

I'm getting these warnings during rbd and ceph (maybe even more) invocations. They indicate a possible problem because:
===
    It could be possible that a dynamically loaded library, use
    thread_local variable but is dlclose()'d before thread exit.  The
    destructor of this variable will then try to access the address,
    for calling it but it's unloaded, so it'll crash.  We're using
    __elf_phdr_match_addr() to detect and prevent such cases and so
    prevent the crash.
===
this is from : https://github.com/freebsd/freebsd/blob/master/lib/libc/stdlib/cxa_thread_atexit_impl.c
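For readers less familiar with the mechanism in that comment: a thread_local object with a non-trivial destructor gets its destructor registered on first use in a thread, and that destructor is called when the thread exits. The sketch below only shows this ordering inside a single program; it does not involve dlopen()/dlclose(), and the names Tracked, touch_thread_local and run_worker_and_count are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Counts how many thread_local destructors have run so far.
static std::atomic<int> dtor_calls{0};

// Hypothetical type with a non-trivial destructor, for illustration only.
struct Tracked {
    int value = 0;
    ~Tracked() { dtor_calls.fetch_add(1); }
};

// First use in a thread constructs the object and registers ~Tracked
// to be called when that thread exits.
static void touch_thread_local() {
    thread_local Tracked t;
    t.value += 1;
}

// Runs one worker thread to completion and returns how many destructors
// ran because of it.
int run_worker_and_count() {
    const int before = dtor_calls.load();
    std::thread worker(touch_thread_local);
    worker.join();  // the worker's thread_local destructors have run by now
    return dtor_calls.load() - before;
}
```

If Tracked lived in a library that was dlclose()'d before worker exited, the registered destructor address would point into unmapped memory, which is the case the libc check is guarding against.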

Now it could be that dlclose() and thread exit just happen close to one another. But still, this check has been hard-coded in libc since 2017, so I'm sort of expecting that a recent change has caused this.

And as indicated, it is a possible cause of crashes, because thread exit is going to clean up things that are no longer there.

Now the 20 dollar question is:
     Where was this introduced??

Otherwise I'll have to try and throw my best gdb capabilities at it, and try to invoke an rbd call and see where it activates this warning.

With some debugging foo it was rather simple to find the dtor with the problem:

__cxa_thread_call_dtors: dtr 0x80c9e1bc0 from unloaded dso, skipping
cxa_thread_walk (cb=<optimized out>) at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
129                     free(dtor);
(gdb) info symbol 0x80c9e1bc0
std::__1::random_device::~random_device() in section .text of /usr/lib/libc++.so.1

And this is during process exit:
#0  cxa_thread_walk (cb=<optimized out>) at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
#1  __cxa_thread_call_dtors () at /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:144
#2  0x000000080cbdfb9a in exit (status=45) at /usr/srcs/head/src/lib/libc/stdlib/exit.c:73
#3  0x000000000060a09c in _start (ap=<optimized out>, cleanup=<optimized out>) at /usr/srcs/head/src/lib/csu/amd64/crt1.c:74

So I guess that it could be just about anywhere that random() is used?
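Since __cxa_thread_call_dtors only walks destructors registered for thread_local objects, the gdb output points at a thread_local std::random_device somewhere rather than at plain random(). A minimal sketch of the kind of code that would register such a destructor follows; roll_die is a hypothetical helper and is not taken from the Ceph sources.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper, for illustration only (not Ceph code).
// The thread_local random_device has a non-trivial destructor in libc++,
// so its first use in a thread registers ~random_device() to be called
// at thread (or process) exit.
unsigned roll_die() {
    thread_local std::random_device rd;
    thread_local std::mt19937 engine(rd());  // seeded once per thread
    std::uniform_int_distribution<unsigned> dist(1, 6);
    return dist(engine);
}
```

Grepping the tree for thread_local together with random_device might be a quicker way to find the candidates than stepping through an rbd call in gdb.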

BTW: I have the same issue on the Jenkins build for mimic.

More on this issue: there seems to be a substantial difference between Linux and FreeBSD in how they manage opened dynamic libraries:

On 26/08/2018 12:19, David Chisnall wrote:
The FreeBSD implementation here looks racy. If one thread dlcloses an object while another thread is exiting, we can end up calling a function at an invalid memory address. It also looks as if it may be possible to unload one library, load another at the same address, and end up executing entirely the wrong code, which would have some serious security implications.

The GNU/Linux equivalent of this function locks the DSO in memory until all references to it have gone away. A call to dlclose() on GNU/Linux will not actually unload the library until all threads with destructors in that library have been unloaded. I believe that this reuses the same reference counting mechanism that allows the same library to be dlopened and dlclosed multiple times.

It would be nice if the FreeBSD version had the same behaviour, because this is almost certainly expected in code written on other platforms.
===========
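To make the behaviour David describes concrete, here is a toy model of the reference counting. It is only a sketch under the assumption that a pending thread destructor holds a reference just like a dlopen() does; ToyDso, toy_acquire and toy_release are invented names and none of this is real loader code from either platform.

```cpp
#include <cassert>

// Toy stand-in for a loaded shared object.
struct ToyDso {
    int refs = 0;         // dlopen() calls plus pending thread destructors
    bool mapped = false;  // whether the code is still in memory
};

// Models dlopen(), or a thread registering a destructor from this object.
void toy_acquire(ToyDso& dso) {
    ++dso.refs;
    dso.mapped = true;
}

// Models dlclose(), or a registered thread destructor having run.
void toy_release(ToyDso& dso) {
    if (dso.refs > 0 && --dso.refs == 0)
        dso.mapped = false;  // only the last release unmaps the object
}

// Walks through: dlopen, thread registers a dtor, dlclose, thread exit.
bool stays_mapped_until_thread_exit() {
    ToyDso dso;
    toy_acquire(dso);  // dlopen()
    toy_acquire(dso);  // thread_local destructor registered
    toy_release(dso);  // dlclose(): object must stay mapped
    const bool mapped_after_dlclose = dso.mapped;
    toy_release(dso);  // thread exits, destructor has run
    return mapped_after_dlclose && !dso.mapped;
}
```

In this model the dlclose() in the middle does not unmap anything, so the destructor still has valid code to call. Without the extra reference, the object would already be gone at that point and the destructor would have to be skipped.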

So would it be a correct assumption that what I see happens because the ceph project actually relies on this feature of the Linux DL implementation?

--WjW



