On 11/10/18 16:13, Mathieu Desnoyers wrote:
> ----- On Oct 11, 2018, at 6:37 AM, Szabolcs Nagy Szabolcs.Nagy@xxxxxxx wrote:
>
>> On 10/10/18 20:19, Mathieu Desnoyers wrote:
>>> In order to integrate rseq into user-space applications, add a reference
>>> counter field after the struct rseq TLS ABI so many rseq users can be
>>> linked into the same application (e.g. librseq and glibc). The
>>> reference count ensures that rseq syscall registration/unregistration
>>> happens only for the most early/late user for each thread, thus ensuring
>>> that rseq is registered across the lifetime of all rseq users for a
>>> given thread.
>> ...
>>> +__attribute__((visibility("hidden"))) __thread
>>> +volatile struct libc_rseq __lib_rseq_abi = {
>> ...
>>> +extern __attribute__((weak, alias("__lib_rseq_abi"))) __thread
>>> +volatile struct rseq __rseq_abi;
>> ...
>>> @@ -70,7 +86,7 @@ int rseq_register_current_thread(void)
>>>      sigset_t oldset;
>>>
>>>      signal_off_save(&oldset);
>>> -    if (refcount++)
>>> +    if (__lib_rseq_abi.refcount++)
>>>          goto end;
>>>      rc = sys_rseq(&__rseq_abi, sizeof(struct rseq), 0, RSEQ_SIG);
>>
>> why do you use a local refcounter instead of the __rseq_abi one?
>
> There is no refcount in struct rseq (the ABI between kernel and user-space).
> The registration refcount was part of an earlier version of the rseq system call,
> but we decided against keeping it in the kernel.
>
> So I'm adding one _after_ struct rseq, purely to allow interaction between
> various user-space components (program/libraries).

then all those components must use the same
rseq_register_current_thread
rseq_unregister_current_thread
functions and not call the syscall on their own.

in which case the refcount could be a static __thread
variable. but it's in a magic struct that's called
"abi" which is confusing, the counter is not abi, it's
in a hidden object.

>> what prevents calling rseq_register_current_thread more than 4G times?
>
> Nothing. It would indeed be cleaner to error out if we detect that refcount is at
> INT_MAX. Is that what you have in mind ?

yes

>> why cant the kernel see that the same address is registered again and succeed?
>
> It can, and it does. However, refcounting at user-level is needed to ensure
> the registration "lifetime" for rseq covers its entire use. If we have two libraries
> using rseq, we end up with the following scenario:
>
> Thread 1
>
> libA registers rseq
> libB registers rseq
> libB unregisters rseq
> libA uses rseq -> bug! it's been unregistered by libB.
> libA unregisters rseq -> unexpected, it's already been unregistered.
>
> same applies if libA unregisters rseq before libB (and libB try to use rseq
> after libA has unregistered).
>
> The refcount in user-space fixes this.

i see.

> Thoughts ?
>
> Thanks,
>
> Mathieu
>
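
to make the two points above concrete (a plain static __thread counter, and erroring out at INT_MAX), here is a rough sketch, not the actual patch. the fake_sys_rseq_* helpers, the sketch_* names and the EOVERFLOW/EINVAL error codes are made up for illustration; the real code calls sys_rseq() and blocks signals around the counter update, which is left out here.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-ins for the real sys_rseq() calls: they only record
 * whether the current thread counts as "registered with the kernel". */
static __thread int fake_kernel_registered;

static int fake_sys_rseq_register(void)
{
    fake_kernel_registered = 1;
    return 0;
}

static int fake_sys_rseq_unregister(void)
{
    fake_kernel_registered = 0;
    return 0;
}

/* Per-thread user-space reference count, kept outside any "abi" struct. */
static __thread unsigned int rseq_refcount;

/* Returns 0 on success, -1 with errno set on failure. */
int sketch_rseq_register_current_thread(void)
{
    if (rseq_refcount == INT_MAX) {
        errno = EOVERFLOW;  /* assumption: error code picked for the sketch */
        return -1;
    }
    if (rseq_refcount++ > 0)
        return 0;           /* an earlier user already registered */
    if (fake_sys_rseq_register() != 0) {
        rseq_refcount--;    /* roll back so the count stays consistent */
        return -1;
    }
    return 0;
}

/* Returns 0 on success, -1 with errno set on failure. */
int sketch_rseq_unregister_current_thread(void)
{
    if (rseq_refcount == 0) {
        errno = EINVAL;     /* unbalanced unregister */
        return -1;
    }
    if (--rseq_refcount > 0)
        return 0;           /* other users still rely on the registration */
    return fake_sys_rseq_unregister();
}

/* Test helper: reports the fake kernel-side registration state. */
int sketch_rseq_is_registered(void)
{
    return fake_kernel_registered;
}
```

with this, the libA/libB sequence from the scenario keeps the registration alive until the last user unregisters, and a further unbalanced unregister is reported as an error instead of being silently accepted.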