Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu <crazylht@xxxxxxxxx> wrote:
>
> Hi:
>   The original problem was that some users wanted the cmdline option
> -ffast-math not to act on intrinsic production code. .i.e for codes
> like
>
> #include<immintrin.h>
> __m256d
> foo2 (__m256d a, __m256d b, __m256d c, __m256d d)
> {
> __m256d tmp = _mm256_add_pd (a, b);
> tmp = _mm256_sub_pd (tmp, c);
> tmp = _mm256_sub_pd (tmp, d);
> return tmp;
> }
>
> compiled with -O2 -mavx2 -ffast-math, users expected codes generated like
>
> vaddpd ymm0, ymm0, ymm1
> vsubpd ymm0, ymm0, ymm2
> vsubpd ymm0, ymm0, ymm3
>
> but not
>
> vsubpd ymm1, ymm1, ymm2
> vsubpd ymm0, ymm0, ymm3
> vaddpd ymm0, ymm1, ymm0
>
>
> For the LLVM side, there're mechanisms like
> #pragma float_control( precise, on, push)
> ...(intrinsics definition)..
> #pragma float_control(pop)
>
> When intrinsics are inlined, their IRs will be marked with
> "no-fast-math", and even if the caller is compiled with -ffast-math,
> reassociation only happens to those IRs which are not marked with
> "no-fast-math". It seems to be more flexible to support fast math
> control of a region(inside a function).
Testcase
https://godbolt.org/z/9cYMGGWPG
>
> Does GCC have a similar mechanism?
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao



[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux