Re: Warnings about dlclose before thread exit

On Mon, Sep 3, 2018 at 4:28 PM Willem Jan Withagen <wjw@xxxxxxxxxxx> wrote:
>
> On 18/08/2018 16:20, Willem Jan Withagen wrote:
> > On 18/08/2018 14:46, Willem Jan Withagen wrote:
> >> Hi,
> >>
> >> I have upgraded to FreeBSD ALPHA 12.0, but I don't think the errors
> >> stem from there. Although they could be in one of the libs that came
> >> along with the upgrade.
> >>
> >> I'm getting these warnings during rbd and ceph (maybe even more)
> >> invocations that indicate a possible problem because:
> >> ===
> >>     It could be possible that a dynamically loaded library, use
> >>     thread_local variable but is dlclose()'d before thread exit.  The
> >>     destructor of this variable will then try to access the address,
> >>     for calling it but it's unloaded, so it'll crash.  We're using
> >>     __elf_phdr_match_addr() to detect and prevent such cases and so
> >>     prevent the crash.
> >> ===
> >> this is from :
> >> https://github.com/freebsd/freebsd/blob/master/lib/libc/stdlib/cxa_thread_atexit_impl.c
> >>
> >>
> >> Now it could be that dlclose() and thread exit are just close to one
> >> another. But this check has been embedded in libc since 2017, so I'm
> >> sort of expecting that a recent change has caused this.
> >>
> >> And as indicated it is a possible cause for crashes, because
> >> thread_exit is going to clean up things that are no longer there.
> >>
> >> Now the 20 dollar question is:
> >>      Where was this introduced??
> >>
> >> Otherwise I'll have to try and throw my best gdb capabilities at it,
> >> and try to invoke an rbd call and see where it activates this warning.
> >
> > With some debugging foo it was rather simple to find the dtor with a problem:
> >
> > __cxa_thread_call_dtors: dtr 0x80c9e1bc0 from unloaded dso, skipping
> > cxa_thread_walk (cb=<optimized out>) at
> > /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
> > 129                     free(dtor);
> > (gdb) info symbol 0x80c9e1bc0
> > std::__1::random_device::~random_device() in section .text of
> > /usr/lib/libc++.so.1
> >
> > And this is during process exit:
> > #0  cxa_thread_walk (cb=<optimized out>) at
> > /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:129
> > #1  __cxa_thread_call_dtors () at
> > /usr/srcs/head/src/lib/libc/stdlib/cxa_thread_atexit_impl.c:144
> > #2  0x000000080cbdfb9a in exit (status=45) at
> > /usr/srcs/head/src/lib/libc/stdlib/exit.c:73
> > #3  0x000000000060a09c in _start (ap=<optimized out>, cleanup=<optimized
> > out>) at /usr/srcs/head/src/lib/csu/amd64/crt1.c:74
> >
> > So I guess that it could be just about anywhere random() is used?
> >
> > BTW: I have the same issue on the Jenkins build for mimic
>
> Again more about this issue, and it seems there is a substantial
> difference between Linux and FreeBSD in managing opened dynamic libraries:
>
> On 26/08/2018 12:19, David Chisnall wrote:
> The FreeBSD implementation here looks racy.  If one thread dlcloses an
> object while another thread is exiting, we can end up calling a function
> at an invalid memory address.  It also looks as if it may be possible to
> unload one library, load another at the same address, and end up
> executing entirely the wrong code, which would have some serious
> security implications.
>
> The GNU/Linux equivalent of this function locks the DSO in memory until
> all references to it have gone away.  A call to dlclose() on GNU/Linux
> will not actually unload the library until all threads with destructors
> in that library have been unloaded.  I believe that this reuses the same
> reference counting mechanism that allows the same library to be dlopened
> and dlclosed multiple times.
>
> It would be nice if the FreeBSD version had the same behaviour, because
> this is almost certainly expected in code written on other platforms.

agreed.

> ===========
>
> So would it be a correct assumption that what I see is because the
> ceph project actually uses this feature of the Linux DL-implementation?

please refer to
http://sources.freebsd.org/HEAD/src/lib/libc/stdlib/cxa_thread_atexit_impl.c
.

i think FreeBSD's libc is trying to avoid a crash when calling the dtor
of a "thread_local random_device" while a thread is exiting. but the
instance of "random_device" lives in a shared library
(libceph-common.so i guess), so once libceph-common is dlclose()'ed,
the dtor of that instance is called. afterwards, the thread exits and
libc tries to call the dtors registered with __cxa_thread_atexit(),
and finds that some of them have already been called, hence it
complains. i don't think it should print this error message at all, as
this use case is expected.
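
to illustrate the mechanism, here is a minimal sketch (not Ceph code; the
names Tracker, worker and run_worker are made up for this example). it
assumes a toolchain following the Itanium C++ ABI, where the compiler
registers the destructor of a thread_local object with
__cxa_thread_atexit() the first time the object is constructed, so that
the dtor runs when the owning thread exits. the sketch keeps everything
in one binary; in the case discussed here the object would live in a
dlopen()'ed library instead.

```cpp
#include <string>
#include <thread>
#include <vector>

// Event log: written only by the worker thread, read after join().
static std::vector<std::string> events;

// Hypothetical stand-in for an object with a non-trivial destructor,
// such as std::random_device.
struct Tracker {
    Tracker() { events.push_back("ctor"); }
    ~Tracker() { events.push_back("dtor"); }
    void touch() { events.push_back("use"); }
};

void worker() {
    // Constructed on first use in this thread; its destructor is
    // deferred until the thread exits.
    thread_local Tracker t;
    t.touch();
    events.push_back("worker returns");
}

// Runs worker() in a fresh thread and returns the recorded events.
std::vector<std::string> run_worker() {
    events.clear();
    std::thread th(worker);
    th.join();  // thread-local destructors have completed by now
    return events;
}
```

the destructor is recorded only after worker() has returned, i.e. at
thread exit. if the code containing ~Tracker() had been unmapped by
dlclose() in between, that deferred call is the one libc would have to
skip.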

so, i don't think we are relying on a platform-dependent "feature". what
we rely on is a behavior that makes sense.

>
> --WjW
>


-- 
Regards
Kefu Chai


