Re: [AArch64][Spec2017]Question about mlow-precision-div optimization.

Hi,

> The data I presented was acquired on a Cortex-A57 CPU.

> Your point that on some modern CPUs fdiv is faster than the reciprocal
> approximation is a new aspect I haven't come across.

Well, on Cortex-A57 division is also faster than the reciprocal approximation, e.g. lbm_r is ~3% slower using reciprocal divide.

> And do you think it is worth us providing a parameter to alter the number of
> iterations, so that accuracy can be traded off against speed?

What do you mean? We already have -mlow-precision-div (and -mlow-precision-sqrt/-mlow-precision-recip-sqrt).
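
To make the accuracy/speed trade-off concrete, here is a minimal C sketch of a reciprocal-based divide with a selectable number of Newton-Raphson refinement steps. It is not what GCC emits: the compiler uses the hardware estimate and step instructions, whereas this sketch assumes a software linear initial estimate, and the names approx_recip, approx_div and steps are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: approximate 1/d for d > 0 using `steps`
   Newton-Raphson refinements of a linear initial estimate. */
double approx_recip(double d, int steps)
{
    assert(d > 0.0 && steps >= 0);

    int e;
    double m = frexp(d, &e);                     /* d = m * 2^e, m in [0.5, 1) */

    /* Linear estimate of 1/m; relative error is at most 1/17. */
    double x = 48.0 / 17.0 - (32.0 / 17.0) * m;

    /* Each step roughly squares the relative error: x <- x * (2 - m * x). */
    for (int i = 0; i < steps; i++)
        x = x * (2.0 - m * x);

    return ldexp(x, -e);                         /* 1/d = (1/m) * 2^-e */
}

/* Hypothetical helper: n / d computed as n * (1/d). */
double approx_div(double n, double d, int steps)
{
    return n * approx_recip(d, steps);
}
```

With this starting estimate the relative error is at most 1/17 with no refinement, about 3.5e-3 after one step and about 1.2e-5 after two, and it reaches the limits of double precision after four. Dropping steps saves multiplies but costs accuracy, which is the same kind of trade-off the low-precision options make.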

> Since SPEC2017 checks results and produces a test report that flags miscomputed cases,
> I suppose the performance improvement is valid.

Try perf stat to show instruction counts. If they do not increase due to the extra reciprocal
operations, the benchmark is running incorrectly even if it passes the basic checks.

Cheers,
Wilco


