Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation

Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> · Mon, 26 Nov 2018 14:22:05 -0500 (EST)

----- On Nov 26, 2018, at 12:10 PM, Rich Felker dalias@xxxxxxxx wrote:

> On Mon, Nov 26, 2018 at 05:03:02PM +0100, Florian Weimer wrote:
>> * Mathieu Desnoyers:
>> 
>> > So let's make __rseq_abi and __rseq_refcount strong symbols then ?
>> 
>> Yes, please.  (But I'm still not sure we need the reference counter.)
> 
> The reference counter is needed for out-of-libc implementations
> interacting and using the dtor hack. An in-libc implementation doesn't
> need to inspect/honor the reference counter, but it does seem to need
> to indicate that it has a reference, if you want it to be compatible
> with out-of-libc implementations, so that the out-of-libc one will not
> unregister the rseq before libc is done with it.

Let's consider two use-cases here: one (simpler) is use of rseq TLS
from thread context by out-of-libc implementations. The other is use of
rseq TLS from signal handler by out-of-libc implementations.

If we only care about users of rseq from thread context, then libc
could simply set the refcount value to 1 on thread start,
and should not care about the value on thread exit. The libc can
either directly call rseq unregister, or rely on thread calling exit
to implicitly unregister rseq, which depends on its own TLS life-time
guarantees. For instance, if the IE-model TLS is valid up until call
to exit, just calling the exit system call is fine. However, if a libc
has a window at thread exit during which the kernel can preempt the
thread with the IE-model TLS area being already reclaimed, then it
needs to explicitly call rseq unregister before freeing the TLS.

The second use-case is out-of-libc implementations using rseq from
signal handler. This one is trickier. First, pthread_key setspecific
is unfortunately not async-signal-safe. I can't find a good way to
seamlessly integrate rseq into out-of-libc signal handlers while
performing lazy registration without races on thread exit. If we
figure out a way to do this though, we should increment the refcount
at thread start in libc (rather than just set it to 1) in case a
signal handler gets nested immediately over the start of the thread
and registers rseq as well.

It looks like it's not the only issue I have with calling lttng-ust
instrumentation from signal handlers, here is the list I have so
far:

* glibc global-dynamic TLS variables are not async-signal-safe,
  and lttng-ust cannot use IE-model TLS because it is meant to be
  dlopen'd,
* pthread_setspecific is not async-signal-safe,

There should be ways to eventually solve those issues, but it would
be nice if for now the way rseq is implemented in libc does not add
yet another limitation for signal handlers.

> 
> Alternatively another protocol could be chosen for this purpose, but
> if has to be something stable and agreed upon, since things would
> break badly if libc and the library providing rseq disagreed.

Absolutely. We need to agree on that protocol before user-space
applications/libraries start using rseq.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com