Re: Function multiversioning question

... sorry, please ignore my previous email!
clang actually ignores the #pragma entirely. Unlike gcc, it does so
without any warning, so I missed that.

On 1/17/19 10:38 AM, Martin Reinecke wrote:
> Hi,
> 
> I'm coming back to this after some experiments. If one compiles the
> attached example with
> 
> gcc -c archtest1.c
> 
> one gets the output
> 
> archtest1.c:4:2: warning: #warning outer file: AVX512F not defined [-Wcpp]
>  #warning outer file: AVX512F not defined
>   ^~~~~~~
> In file included from archtest1.c:8:
> archtest2.c:2:2: warning: #warning inner file: AVX512F defined [-Wcpp]
>  #warning inner file: AVX512F defined
> 
> which seems to contradict what Jonathan said about macros not being
> influenced by the #pragmas.
> 
> However, if I compile the same code with clang, I get
> 
> martin@debian:~/tmp$ clang-7 -c archtest1.c
> archtest1.c:4:2: warning: outer file: AVX512F not defined [-W#warnings]
> #warning outer file: AVX512F not defined
>  ^
> In file included from archtest1.c:8:
> ./archtest2.c:4:2: warning: inner file: AVX512F not defined [-W#warnings]
> #warning inner file: AVX512F not defined
>  ^
> 2 warnings generated.
> 
> So the compilers behave differently, even though clang tries to emulate
> the GCC pragma.
> 
> My question is now: is the fact that gcc defines the __AVX512F__ macro
> in the included file a bug, or is this working as intended?
> 
> Thanks,
>   Martin
> 
> 
> On 10/25/18 2:50 PM, Marc Glisse wrote:
>> On Thu, 25 Oct 2018, Martin Reinecke wrote:
>>
>>> Hi Jonathan,
>>>
>>> thanks for the quick reply!
>>>
>>>> Macros are defined during preprocessing, and the preprocessor doesn't
>>>> know anything about the target_clones attribute. When the compiler
>>>> sees the attribute it can't go back in time and alter the result of
>>>> earlier preprocessing.
>>>
>>> I feared as much.
>>> This creates a nasty asymmetry in the sense that gcc's own optimizations
>>> will be able to use all target features (because the compiler knows that
>>> it is OK to use specific features like AVX instructions) whereas the
>>> user has no way to hand-optimize where this becomes necessary. At least
>>> not using this nice mechanism.
>>>
>>>>> Is there a way to achieve what I have in mind?
>>>>
>>>> If you want three different implementations of the function I think
>>>> you need three different clones. Or do runtime checks for the CPU
>>>> features inside the function, but that seems suboptimal.
>>>
>>> I guess I'll just put all functions in question in a separate file and
>>> compile this with different flags and name prefixes.
>>
>> target_clones does nothing magic; you can also look at target and ifunc.
>> https://gcc.gnu.org/wiki/FunctionMultiVersioning
>>


