Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

Hongtao Liu via Gcc-help <gcc-help@xxxxxxxxxxx> · Wed, 14 Jul 2021 13:18:29 +0800



On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu <crazylht@xxxxxxxxx> wrote:
>
> Hi:
>   The original problem was that some users wanted the command-line
> option -ffast-math not to affect production code written with
> intrinsics, i.e. for code like
>
> #include <immintrin.h>
> __m256d
> foo2 (__m256d a, __m256d b, __m256d c, __m256d d)
> {
>   __m256d tmp = _mm256_add_pd (a, b);
>   tmp = _mm256_sub_pd (tmp, c);
>   tmp = _mm256_sub_pd (tmp, d);
>   return tmp;
> }
>
> compiled with -O2 -mavx2 -ffast-math, users expected generated code like
>
> vaddpd ymm0, ymm0, ymm1
> vsubpd ymm0, ymm0, ymm2
> vsubpd ymm0, ymm0, ymm3
>
> but not
>
> vsubpd ymm1, ymm1, ymm2
> vsubpd ymm0, ymm0, ymm3
> vaddpd ymm0, ymm1, ymm0
>
>
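To make the difference between the two instruction sequences concrete,
here is a minimal scalar sketch in C. It is not part of the original
report: the function names (in_order, reassociated, check_orders_differ)
and the input values are assumptions chosen for illustration. It assumes
IEEE 754 double arithmetic with round-to-nearest and a build without
-ffast-math, so that each function is evaluated exactly as written.

```c
#include <assert.h>

/* Source order, matching the expected vaddpd/vsubpd/vsubpd sequence:
   ((a + b) - c) - d. */
double in_order(double a, double b, double c, double d)
{
    double tmp = a + b;
    tmp = tmp - c;
    tmp = tmp - d;
    return tmp;
}

/* Reassociated order, matching the vsubpd/vsubpd/vaddpd sequence:
   (b - c) + (a - d). */
double reassociated(double a, double b, double c, double d)
{
    double t1 = b - c;
    double t0 = a - d;
    return t1 + t0;
}

/* Hypothetical self-check (illustration only): with these inputs the
   two evaluation orders give different results. */
void check_orders_differ(void)
{
    const double big = 0x1p53; /* 2^53; big + 1.0 rounds back to big */

    assert(in_order(big, 1.0, big, 1.0) == -1.0);
    assert(reassociated(big, 1.0, big, 1.0) == 0.0);
}
```

Under -ffast-math the compiler may treat both forms as equivalent and
pick either one, which is why the result of intrinsic code can change.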
> On the LLVM side, there are mechanisms like
> #pragma float_control( precise, on, push)
> ...(intrinsics definition)..
> #pragma float_control(pop)
>
> When the intrinsics are inlined, their IR is marked with
> "no-fast-math", so even if the caller is compiled with -ffast-math,
> reassociation is applied only to IR that is not marked with
> "no-fast-math". This seems more flexible, since it allows fast-math
> control over a region inside a function.
Testcase:
https://godbolt.org/z/9cYMGGWPG
>
> Does GCC have a similar mechanism?
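For comparison, the closest thing I can sketch on the GCC side works at
function granularity, not on a region inside a function. The example
below uses the documented #pragma GCC push_options / optimize /
pop_options pragmas to turn off -fassociative-math for one function.
The function names add_sub_sub and check_add_sub_sub are hypothetical,
scalar doubles stand in for the __m256d intrinsics, and it is an
assumption that disabling associative-math alone is enough to keep the
source order in a -ffast-math build. This sketch also does not establish
what happens once such a function is inlined into a fast-math caller,
and GCC's documentation discourages relying on the related optimize
attribute in production code.

```c
#include <assert.h>

#pragma GCC push_options
#pragma GCC optimize ("no-associative-math")

/* Hypothetical helper: the add/sub/sub chain from foo2, written with
   scalar doubles. The pragma above requests -fno-associative-math for
   functions defined until pop_options. */
double add_sub_sub(double a, double b, double c, double d)
{
    double tmp = a + b;
    tmp = tmp - c;
    tmp = tmp - d;
    return tmp;
}

#pragma GCC pop_options

/* Hypothetical self-check: source-order evaluation of these inputs
   yields -1.0 (assuming IEEE 754 doubles, round-to-nearest). */
void check_add_sub_sub(void)
{
    const double big = 0x1p53; /* 2^53; big + 1.0 rounds back to big */

    assert(add_sub_sub(big, 1.0, big, 1.0) == -1.0);
}
```

This only approximates the float_control behaviour described above,
because the setting applies to whole function definitions.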
>
>
> --
> BR,
> Hongtao


-- 
BR,
Hongtao