On Fri, 28 May 2021, Ho, Lennox via Gcc-help wrote:

> > You need to pass -ftls-model=initial-exec when compiling libtls_import.so, not
> > the exporting library. I am a bit surprised you've decided to do that the other
> > way around :)
>
> Ahh ok so it does appear that I really misunderstood how static TLS works :/.
>
> My assumption has been that since (static) TLS variables are placed in the
> static TLS segment, at a constant offset (to the left) from the thread pointer
> (as shown here
> https://th.bing.com/th/id/R92210a8cb7df88cb1b9586ce73d8da90?rik=FX6vqTs2uG%2brzw&pid=ImgRaw),
> that initial-exec/local-exec is an imperative on the TLS data itself, not on
> the act of *accessing* said data.

On one hand, yes, it's a run-time property of the TLS symbol. On the other
hand, the compiler selects the most efficient code to access a TLS variable
based on what it knows about its location. By passing -ftls-model=initial-exec
you're promising to the compiler that each TLS symbol will be in the static TLS
block.

> While tagging the DSO that accesses the TLS data (libtls_export.so) with
> DF_STATIC_TLS will do the trick (libtls_export.so must be loaded before
> libtls_import.so - which must be loaded before main() - and so the TLS in
> libtls_export.so can be placed in a static TLS section), surely tagging the
> export DSO (libtls_export.so) with DF_STATIC_TLS instead would be more
> natural?

Well, it's not clear. For the exporting module there's no difference whether
its TLS definitions are supposed to be in the static block or not. All the
difference is on the importing modules' side: one module could have
general-dynamic references via __tls_get_addr, and the other could have
efficient initial-exec references. Either module could be safely dlopen'ed
provided that the defining module was loaded at program startup.

> My understanding is that the dynamic loader will have the opportunity to fixup
> the offsets in any future DSO that is loaded.

But the dynamic loader does not edit the code (only the addresses in the GOT
and other writable areas).

> Do you mind elaborating why DF_STATIC_TLS is placed on the client/import DSO
> and not the export DSO?

It's just for information. It somewhat matters when the DSO both defines and
references its own TLS data, but in your scenario it's moot. It's placed based
on the code performing the accesses (i.e. on the relocation kinds that code
uses).

> Is there something that I'm still missing?
>
> -------------------
>
> To offer some additional context, I have an unfair reader-writer mutex (unfair
> in that readers are heavily favoured) implementation that uses TLS variables
> to indicate whether a thread is currently "read-locking". There are a few
> other bit and pieces required to make this work, but the advantage of this
> approach is we completely eliminate cache contention (typical spin-locks have
> the problem of threads trying to out invalidate each other's cache!) for
> readers.
>
> Now, the way I'm deploying this TLS variable is by exporting it from a "core"
> DSO that I know will never be dlopened - it will always be loaded before main.
> Code from other DSOs - which may be dlopened at arbitrary points - need to use
> this TLS variable to perform "read-locking". I would like to avoid the cost
> of __tls_get_addr, but at the same time I don't want to force client DSOs to
> build with -ftls-model=initial-exec (that could prevent them from being
> dlopened). I would also like to avoid retrieving this TLS variable through a
> function call. It needs to be:
>
>   mov %fs:0, rbx
>   mov <offset patched by loader>, rcx
>   mov rbx[rcx], rax // rax holds the value of the TLS
>
> Surely this is achievable?

Just use the attribute on the specific variable as mentioned in my previous
email.

Alexander
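
For readers without the earlier message at hand: the attribute referred to is
presumably GCC's tls_model variable attribute, which overrides -ftls-model for
a single variable. The following is a minimal sketch under that assumption.
The names rw_read_depth, reader_enter, reader_exit and reader_active are
hypothetical and do not come from this thread, and the variable is defined in
place only so that the snippet compiles on its own.

```c
/* Illustrative sketch only; all identifiers below are hypothetical. */
#include <stdbool.h>

/* Assumption: in the layout discussed above, this line would instead be an
 * `extern` declaration in a header shared with the client DSOs, and the
 * definition would live in the "core" DSO that is loaded at program startup.
 * The attribute selects the initial-exec model for this variable only, so
 * the rest of a client DSO keeps whatever -ftls-model it was built with. */
__thread int rw_read_depth __attribute__((tls_model("initial-exec")));

/* Mark the calling thread as read-locking (one more nesting level). */
static inline void reader_enter(void) { rw_read_depth++; }

/* Drop one level of read-locking for the calling thread. */
static inline void reader_exit(void) { rw_read_depth--; }

/* True while the calling thread holds at least one read lock. */
static inline bool reader_active(void) { return rw_read_depth > 0; }
```

With the extern form of the declaration, accesses from a client DSO should be
emitted as initial-exec references (thread pointer plus an offset loaded from
the GOT) rather than calls to __tls_get_addr, and per the explanation above
such a client remains safe to dlopen as long as the defining module was loaded
at program startup.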