[Questions] Is there any bit in GIMPLE/RTL to indicate whether an IR statement supports fast-math or not?

Hi:
  The original problem is that some users want the command-line option
-ffast-math not to affect production code written with intrinsics, i.e.
for code like

#include <immintrin.h>
__m256d
foo2 (__m256d a, __m256d b, __m256d c, __m256d d)
{
  __m256d tmp = _mm256_add_pd (a, b);
  tmp = _mm256_sub_pd (tmp, c);
  tmp = _mm256_sub_pd (tmp, d);
  return tmp;
}

compiled with -O2 -mavx2 -ffast-math, users expect the generated code to
follow the source order:

vaddpd ymm0, ymm0, ymm1
vsubpd ymm0, ymm0, ymm2
vsubpd ymm0, ymm0, ymm3

and not the reassociated sequence, which computes (b - c) + (a - d):

vsubpd ymm1, ymm1, ymm2
vsubpd ymm0, ymm0, ymm3
vaddpd ymm0, ymm1, ymm0
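
To make the difference concrete, here is a minimal scalar sketch of one lane
of the two sequences. The function names and the input values are my own
illustration, not taken from any user report; the volatile temporaries are
there only to force each intermediate result to be rounded to double, so the
two orders stay distinct whatever flags the file is built with.

```c
/* Scalar stand-ins for a single lane of the vector code above.
   Function names are hypothetical, chosen for illustration only. */

/* Source order: ((a + b) - c) - d, as written with the intrinsics. */
double eval_source_order(double a, double b, double c, double d)
{
    volatile double t = a + b;  /* volatile: round to double at every step */
    t = t - c;
    t = t - d;
    return t;
}

/* Reassociated order: (b - c) + (a - d), as in the second listing. */
double eval_reassociated(double a, double b, double c, double d)
{
    volatile double t1 = b - c;
    volatile double t0 = a - d;
    return t1 + t0;
}
```

With a = 1e17, b = 1.0, c = 1e17, d = 1.0 the source order gives -1.0, because
a + b rounds back to 1e17, while the reassociated order gives 0.0, because
both b - c and a - d lose the 1.0. Users who pick intrinsics usually do so
because they want exactly the order they wrote.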


On the LLVM side, there are mechanisms like
#pragma float_control(precise, on, push)
... (intrinsic definitions) ...
#pragma float_control(pop)

When the intrinsics are inlined, their IR is marked "no-fast-math", and
even if the caller is compiled with -ffast-math, reassociation is applied
only to IR that is not marked "no-fast-math". This seems more flexible,
because it allows fast-math to be controlled for a region inside a
function.
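
A minimal sketch of how such a guarded definition could look with clang,
assuming a scalar wrapper in place of a real intrinsic. The names
precise_add_sub_sub and precise_caller are hypothetical; the pragmas are
hidden from other compilers only so that the file still builds there.

```c
/* Hypothetical wrapper standing in for an intrinsic definition. */
#if defined(__clang__)
#pragma float_control(precise, on, push)  /* precise semantics from here on */
#endif
static inline double precise_add_sub_sub(double a, double b, double c, double d)
{
    double tmp = a + b;
    tmp = tmp - c;
    tmp = tmp - d;
    return tmp;
}
#if defined(__clang__)
#pragma float_control(pop)                /* restore the previous setting */
#endif

/* The caller is subject to whatever command-line options are in effect.
   Per the description above, the inlined body is expected to keep its
   source order even when the caller is built with -ffast-math. */
double precise_caller(double a, double b, double c, double d)
{
    return precise_add_sub_sub(a, b, c, d);
}
```

The point of the design is that the setting travels with the inlined
operations rather than with the enclosing function.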

Does GCC have a similar mechanism?
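
For reference, the closest thing I am aware of in GCC is per-function
control through the optimize attribute (or #pragma GCC optimize between
push_options and pop_options). Below is a sketch under that assumption; the
macro NO_FAST_MATH and the function name are hypothetical. As far as I know
this applies to a whole function rather than to a region, and I am not sure
what is preserved once such a function is inlined into a -ffast-math caller,
which is the reason for the question.

```c
/* NO_FAST_MATH is a hypothetical helper macro, not a GCC facility.
   The string is handled as if it were passed with a -f prefix,
   i.e. -fno-fast-math, for this one function. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_FAST_MATH __attribute__((optimize("no-fast-math")))
#else
#define NO_FAST_MATH
#endif

NO_FAST_MATH
double add_sub_sub_fn(double a, double b, double c, double d)
{
    double tmp = a + b;  /* intended to stay in source order */
    tmp = tmp - c;
    tmp = tmp - d;
    return tmp;
}
```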


-- 
BR,
Hongtao


