... sorry, please ignore my previous email!
clang actually ignores the #pragma completely, but in contrast to gcc it
does so silently, so I completely missed that.

On 1/17/19 10:38 AM, Martin Reinecke wrote:
> Hi,
>
> I'm coming back to this after some experiments. If one compiles the
> attached example with
>
> gcc -c archtest1.c
>
> one gets the output
>
> archtest1.c:4:2: warning: #warning outer file: AVX512F not defined [-Wcpp]
> #warning outer file: AVX512F not defined
> ^~~~~~~
> In file included from archtest1.c:8:
> archtest2.c:2:2: warning: #warning inner file: AVX512F defined [-Wcpp]
> #warning inner file: AVX512F defined
>
> which seems to contradict what Jonathan said about macros not being
> influenced by the #pragmas.
>
> However, if I compile the same code with clang, I get
>
> martin@debian:~/tmp$ clang-7 -c archtest1.c
> archtest1.c:4:2: warning: outer file: AVX512F not defined [-W#warnings]
> #warning outer file: AVX512F not defined
> ^
> In file included from archtest1.c:8:
> ./archtest2.c:4:2: warning: inner file: AVX512F not defined [-W#warnings]
> #warning inner file: AVX512F not defined
> ^
> 2 warnings generated.
>
> So the compilers behave differently, even though clang tries to emulate
> the GCC pragma.
>
> My question is now: is the fact that gcc defines the __AVX512F__ macro
> in the included file a bug, or is this working as intended?
>
> Thanks,
> Martin
>
>
> On 10/25/18 2:50 PM, Marc Glisse wrote:
>> On Thu, 25 Oct 2018, Martin Reinecke wrote:
>>
>>> Hi Jonathan,
>>>
>>> thanks for the quick reply!
>>>
>>>> Macros are defined during preprocessing, and the preprocessor doesn't
>>>> know anything about the target_clones attribute. When the compiler
>>>> sees the attribute it can't go back in time and alter the result of
>>>> earlier preprocessing.
>>>
>>> I feared as much.
>>> This creates a nasty asymmetry in the sense that gcc's own optimizations
>>> will be able to use all target features (because the compiler knows that
>>> it is OK to use specific features like AVX instructions) whereas the
>>> user has no way to hand-optimize where this becomes necessary. At least
>>> not using this nice mechanism.
>>>
>>>>> Is there a way to achieve what I have in mind?
>>>>
>>>> If you want three different implementations of the function I think
>>>> you need three different clones. Or do runtime checks for the CPU
>>>> features inside the function, but that seems suboptimal.
>>>
>>> I guess I'll just put all functions in question in a separate file and
>>> compile this with different flags and name prefixes.
>>
>> target_clones does nothing magic, you can also look at target and ifunc.
>> https://gcc.gnu.org/wiki/FunctionMultiVersioning
>>
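
The attached archtest1.c and archtest2.c are not reproduced in this thread, so the following is only a single-file sketch of the kind of test being discussed, not the original attachment. It records whether `__AVX512F__` is visible to the preprocessor before and after a `#pragma GCC target("avx512f")`. The macro and function names (`OUTER_HAS_AVX512F`, `INNER_HAS_AVX512F`, `outer_has_avx512f`, `inner_has_avx512f`, `report_avx512f_macro_state`) are hypothetical, and the x86 guard around the pragma is an addition so that the file also compiles on other architectures.

```c
/* Sketch: does "#pragma GCC target" change which feature macros the
   preprocessor sees?  All names here are made up for illustration. */
#include <stdio.h>

/* State of the macro under the command-line flags alone. */
#ifdef __AVX512F__
#define OUTER_HAS_AVX512F 1
#else
#define OUTER_HAS_AVX512F 0
#endif

/* Request AVX512F only on x86 builds; elsewhere the option is meaningless. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#pragma GCC push_options
#pragma GCC target("avx512f")
#endif

/* State of the macro inside the pragma region.  If the compiler honours
   the pragma and updates its macros, this is 1; if the pragma is ignored,
   it is the same as the outer value. */
#ifdef __AVX512F__
#define INNER_HAS_AVX512F 1
#else
#define INNER_HAS_AVX512F 0
#endif

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#pragma GCC pop_options
#endif

/* Accessors, compiled with the default (command-line) options. */
int outer_has_avx512f(void) { return OUTER_HAS_AVX512F; }
int inner_has_avx512f(void) { return INNER_HAS_AVX512F; }

/* Print both states so the two compilers can be compared by eye. */
void report_avx512f_macro_state(void)
{
    printf("outer: __AVX512F__ %s\n",
           outer_has_avx512f() ? "defined" : "not defined");
    printf("inner: __AVX512F__ %s\n",
           inner_has_avx512f() ? "defined" : "not defined");
}
```

Going by the output quoted above, calling `report_avx512f_macro_state()` from a program built with gcc on x86 without `-mavx512f` should report the macro as defined only in the inner region, while a clang-7 build should report it as not defined in both, since clang ignores the pragma. Other compiler versions may behave differently.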