For the `_bzhi_u32` intrinsic to be defined when I include `x86intrin.h`, I need to compile with `-mbmi2`, but that lets GCC sprinkle other BMI2 instructions all over the code (for example, for bit shifts with `<<`). I want to restrict BMI2 to my explicit use of it through intrinsics, because I want to check at runtime whether BMI2 is available and dispatch to a different function in each case (one compiled with BMI2 and explicit intrinsics, the other without) using `__attribute__((ifunc(xxx)))`.

I tried manually adding `#define __BMI2__` before including `x86intrin.h` and declaring the function with `__attribute__((target("bmi2")))`, but that gives me an "undefined reference to _bzhi_u32" error.

So far, the only thing that has worked is moving the BMI2/BZHI function to a separate .c file and either compiling that file with `-mbmi2` or using `#pragma GCC target "bmi2"` in it. This works, but it destroys my ability to declare the function `static` and `inline`.

Is there a better way to use the BZHI instruction in one static inline function without letting GCC use BMI2 in the rest of the translation unit? I have never used inline assembly; is that the only option left?

-- /Rubén
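
For illustration, here is a minimal sketch of what the inline-assembly route could look like, assuming GCC extended asm on x86 and an assembler that recognizes BZHI. The names `bzhi32`, `bzhi32_asm`, and `bzhi32_portable` are hypothetical, and the plain `__builtin_cpu_supports` check stands in for the `ifunc` resolver mentioned above purely to keep the example short:

```c
#include <stdint.h>

/* Portable fallback: keep the low n bits of x. Like BZHI, only the low
   8 bits of the index are used, and an index >= 32 leaves x unchanged. */
static inline uint32_t bzhi32_portable(uint32_t x, uint32_t n)
{
    n &= 0xffu;
    return n >= 32 ? x : x & ((UINT32_C(1) << n) - 1u);
}

#if defined(__x86_64__) || defined(__i386__)
/* BZHI through GCC extended asm (AT&T operand order: index, source, dest).
   The compiler itself needs no -mbmi2; only the assembler must know BZHI. */
static inline uint32_t bzhi32_asm(uint32_t x, uint32_t n)
{
    uint32_t r;
    __asm__("bzhi %2, %1, %0" : "=r"(r) : "r"(x), "r"(n) : "cc");
    return r;
}
#endif

/* Hypothetical dispatcher: a simple runtime check used here in place of
   an ifunc resolver (assumption made for brevity). */
static inline uint32_t bzhi32(uint32_t x, uint32_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("bmi2"))
        return bzhi32_asm(x, n);
#endif
    return bzhi32_portable(x, n);
}
```

Because the instruction is emitted only inside the asm statement, the rest of the translation unit is still compiled without BMI2, and the helper can stay `static inline`.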