Re: Function multiversioning question

On Thu, 25 Oct 2018 at 12:46, Martin Reinecke
<martin@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I'm trying to use gcc's "target_clones" attribute for some functions in
> a performance critical library. These functions use gcc builtins and
> choose between different sets (standard code, SSE2, AVX) depending on
> the predefined macros __SSE2__ and __AVX__.
> Unfortunately these macros apparently are not set by the compiler when
> it compiles for the individual targets.
>
> Consider the code below:
>
> #include <stdio.h>
>
> __attribute__((target_clones("avx","sse2","default")))
> void foo(void)
>   {
> #if defined(__AVX__)
>   printf("AVX\n");
> #elif defined(__SSE2__)
>   printf("SSE2\n");
> #else
>   printf("nothing special\n");
> #endif
>   }
>
> int main(void)
>   {
>   foo();
>   return 0;
>   }
>
> Compiling and running this on an AVX-capable CPU prints "SSE2", where I
> would have hoped to see "AVX".

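Macros are defined during preprocessing, and the preprocessor doesn't
know anything about the target_clones attribute. By the time the
compiler sees the attribute, the #if block has already been resolved
once for the whole translation unit, using only the command-line
options, and the compiler can't go back in time and alter the result
of that earlier preprocessing for each clone. Assuming an x86-64 build
without -mavx, __SSE2__ is defined by default and __AVX__ is not, so
every clone, including the AVX one, contains the "SSE2" branch.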
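
To make that concrete, here is a minimal sketch of the same situation. The names BUILD_ISA and foo_isa are made up for illustration, and the function returns a string instead of printing it. It assumes an x86 target and a toolchain that supports target_clones (GCC with ifunc support, for example on Linux with glibc).

```c
#include <stdio.h>

/* The preprocessor picks exactly one of these, once, for the whole file,
   based only on the command-line flags (-mavx, -msse2, -march=...). */
#if defined(__AVX__)
#define BUILD_ISA "AVX"
#elif defined(__SSE2__)
#define BUILD_ISA "SSE2"
#else
#define BUILD_ISA "nothing special"
#endif

/* Every clone is generated from the same, already preprocessed body,
   so every clone returns the same string regardless of its target. */
__attribute__((target_clones("avx", "sse2", "default")))
const char *foo_isa(void)
{
    return BUILD_ISA;
}
```

Calling puts(foo_isa()) from main prints whatever the command-line flags selected, not what the running CPU supports, which matches the behaviour reported above.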

> Is there a way to achieve what I have in mind?

If you want three different implementations of the function, I think
you need to write three separate functions, one per target, rather
than relying on clones of a single body. Alternatively, do runtime
checks for the CPU features inside the function, but that seems
suboptimal.
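
One possible shape for the "separate functions" approach, as a rough sketch rather than a recommendation: each variant gets its own target attribute, and a small dispatcher picks one at run time using GCC's __builtin_cpu_supports. The names foo_avx, foo_sse2, foo_default and resolve_foo are hypothetical, the bodies only return a label where the real code would use the matching builtins, and an x86 target is assumed.

```c
#include <stdio.h>

/* Hypothetical names. Each function is written separately, so each body
   is free to use the builtins that match its own target attribute. */
__attribute__((target("avx")))
static const char *foo_avx(void) { return "AVX"; }

__attribute__((target("sse2")))
static const char *foo_sse2(void) { return "SSE2"; }

static const char *foo_default(void) { return "nothing special"; }

typedef const char *(*foo_fn)(void);

/* Choose an implementation from what the running CPU reports. */
static foo_fn resolve_foo(void)
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return foo_avx;
    if (__builtin_cpu_supports("sse2"))
        return foo_sse2;
    return foo_default;
}

/* Public entry point: resolve on first use, then call through the
   cached pointer. The caching here is not synchronised between threads. */
const char *foo(void)
{
    static foo_fn impl;
    if (!impl)
        impl = resolve_foo();
    return impl();
}
```

The CPU check runs only on the first call, so the cost on later calls is one indirect call rather than a feature test inside the hot function.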


