Re: [RFC PATCH 0/3] Implement getcpu_cache system call

On January 12, 2016 4:22:29 PM PST, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>----- On Jan 12, 2016, at 4:02 PM, Ben Maurer bmaurer@xxxxxx wrote:
>
>>> One idea I have would be to let the kernel reserve some space either
>>> after the first stack address (for a stack growing down) or at the
>>> beginning of the allocated TLS area for each thread in
>>> copy_thread_tls() by fiddling with sp or the tls base address when
>>> creating a thread.
>> 
>> Could this be implemented by having glibc use a well-known symbol name
>> to define the per-thread TLS area? If a high-performance application
>> wants to avoid any relocations in accessing this variable, it would
>> define it and that definition would override glibc's. This is how
>> things work with malloc. glibc has a default malloc implementation, but
>> we link jemalloc directly into our binaries. In addition to changing
>> the malloc implementation, this means that calls to malloc don't go
>> through the PLT.
>
>Just to make sure I understand your proposal: defining a well known
>symbol with a weak attribute in glibc (or bionic...), e.g.:
>
>int32_t __thread __attribute__((weak)) __getcpu_cache;
>
>so that applications which care about bypassing the PLT can override it
>with:
>
>int32_t __thread __getcpu_cache;
>
>glibc/bionic would be responsible for calling the getcpu_cache() system
>call to register/unregister this TLS variable for each thread.
>
>One thing I would like to figure out is whether we can use this in a way
>that would allow introducing getcpu_cache() into applications and
>libraries (e.g. lttng-ust tracer) before it gets implemented in glibc,
>in a way that would keep forward compatibility for whenever it gets
>introduced in glibc.
>
>We can declare __getcpu_cache as a weak symbol in arbitrary libraries,
>and make them register/unregister the cache through the getcpu_cache
>syscall. The main thing that I would need to tweak at the kernel level
>within the system call would be to keep a refcount of the number of
>times __getcpu_cache is registered per thread. This would allow
>multiple registrations, one per library (e.g. lttng-ust) and one for
>glibc, but we would validate that they all register the exact same
>address for a given thread.
>
>The reference counting trick should also work for cases where
>applications define a non-weak __getcpu_cache, and want to call the
>getcpu_cache system call to register it themselves (before glibc adds
>support for it).

This seems like something better done in a tiny common library than in the kernel or by playing symbol-resolution games.
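
To illustrate the common-library idea, here is a minimal sketch, not a definitive implementation. It keeps the per-thread reference count in userspace, so every caller in a thread gets the same address without any kernel-side refcounting or weak symbols. All names (cpu_cache_register, cpu_cache_unregister, kernel_register, kernel_unregister) are hypothetical, and the two kernel_* stubs only stand in for the proposed getcpu_cache system call, whose number and argument layout are not given in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: none of these names is an existing glibc or kernel API. */

/* One cache slot per thread, owned by the library instead of a weak symbol. */
static __thread int32_t cpu_cache = -1;

/* Number of users (application, tracer, libc, ...) registered in this thread. */
static __thread unsigned int cpu_cache_refs;

/*
 * Placeholders for the proposed getcpu_cache() system call. A real library
 * would invoke the system call here; these stubs always report success.
 */
static int kernel_register(int32_t *addr)
{
    (void)addr;
    return 0;
}

static int kernel_unregister(int32_t *addr)
{
    (void)addr;
    return 0;
}

/*
 * Register the calling thread's cache. Only the first caller in a thread
 * reaches the kernel; later callers just take a reference. Returns the
 * address of the cache, or NULL if the kernel registration failed.
 */
int32_t *cpu_cache_register(void)
{
    if (cpu_cache_refs == 0 && kernel_register(&cpu_cache) != 0)
        return NULL;
    cpu_cache_refs++;
    return &cpu_cache;
}

/*
 * Drop one reference. The last user in the thread unregisters the cache
 * with the kernel. Returns 0 on success, -1 if nothing was registered.
 */
int cpu_cache_unregister(void)
{
    if (cpu_cache_refs == 0)
        return -1;
    if (--cpu_cache_refs == 0)
        return kernel_unregister(&cpu_cache);
    return 0;
}
```

Because the library owns the single TLS variable, the "same address for a given thread" check falls out for free: every caller receives a pointer to the same cpu_cache.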

