Re: Options for using hardware implementation of remainder and square root on i32

Kwok, Yipkei wrote:
Although the Intel x86 processors are capable of computing remainder and square root operations in hardware, both g++ and icpc (Intel compiler) realize these operations in software [1].

Question 1: Is there any specific reason why they do it this way (using a software implementation instead of hardware)?

The usual way to implement sqrt, when optimizing, is to execute the sqrt instruction, check the processor flags, and retry with the library function if any flag is set that would indicate a requirement for errno processing or exception handling.
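
A minimal sketch of that fast-path/retry pattern follows, written by hand rather than taken from either compiler. The names hw_sqrt and checked_sqrt are hypothetical. Instead of reading the processor flags, this version tests whether the result is NaN, which is a simpler stand-in for the same decision. The SSE2 intrinsic is used only when the compiler defines __SSE2__; elsewhere std::sqrt stands in for the instruction.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Fast path: the bare hardware square root, with no errno handling.
// hw_sqrt is a hypothetical helper name used only for this sketch.
static inline double hw_sqrt(double x) {
#if defined(__SSE2__)
    __m128d v = _mm_set_sd(x);                 // place x in the low lane
    return _mm_cvtsd_f64(_mm_sqrt_sd(v, v));   // sqrtsd on the low lane
#else
    return std::sqrt(x);                       // stand-in without SSE2
#endif
}

// Hypothetical wrapper: take the fast result when it is an ordinary
// number, otherwise retry through the library so that it can do
// whatever errno or exception processing the implementation provides.
double checked_sqrt(double x) {
    double r = hw_sqrt(x);
    if (r == r) {            // true unless r is NaN
        return r;
    }
    return std::sqrt(x);     // slow path: library call
}
```

For non-negative, non-NaN inputs the library is never called, so the cost of errno support is paid only on the rare slow path.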

I don't know whether the C99 remainder functions are commonly available in compilers when -std=c99 is set, or what g++ and icpc may do to permit their use as an extension beyond the C++ standard. Assuming that they have to be implemented with x87 code, while one would normally be using SSE code generation options, the code size vs. performance tradeoff favors a library function call. When you have performance- or accuracy-critical code requiring remaindering, you probably have to study the facilities of the target platform.
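
For comparing results, it helps to know which remainder is being measured. The sketch below assumes a C++11 <cmath>, which provides both std::remainder (IEEE 754 remainder, quotient rounded to nearest) and std::fmod (quotient truncated toward zero). The struct and function names are hypothetical.

```cpp
#include <cmath>

// Hypothetical holder for the two remainder flavours.
struct Remainders {
    double ieee;       // std::remainder: quotient rounded to nearest
    double truncated;  // std::fmod: quotient truncated toward zero
};

// Compute both remainders of x / y so they can be compared directly.
Remainders both_remainders(double x, double y) {
    Remainders r;
    r.ieee = std::remainder(x, y);
    r.truncated = std::fmod(x, y);
    return r;
}
```

With x = 5.5 and y = 2.0 the quotient is 2.75, so the IEEE remainder is 5.5 - 3 * 2.0 = -0.5 while fmod gives 5.5 - 2 * 2.0 = 1.5. A benchmark that mixes the two is not comparing like with like.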

Question 2: Is there any compiler option that forces these operations to be done in hardware?

In order to support Fortran, gcc has a specific flag to turn off the sqrt retry, at the cost of breaking errno and exception handling. g++ -ffast-math or icpc -fp-model fast=2 might include such an option, but those are big hammers which evidently aren't suitable for general use.

I need these answers in order to compare differences, if there are any, in terms of performance and results.

If you want strictly correct results according to IEEE 754, you must consider whether you want your operations to be widened, e.g. according to the x87 precision setting, and, with icpc, you must set -prec-sqrt. These issues are separate from whether you get an in-line sqrt instruction, but ignoring them will destroy your conclusions. There is a clear performance hit in supporting errno and exception handling, which you normally expect to incur in C++. You have the option of writing in-line asm code, if that is what you want to evaluate.
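
As a rough illustration of the in-line asm route, the sketch below uses GCC-style extended asm to issue the x87 fsqrt instruction directly. It assumes a GCC-compatible compiler targeting x86, and falls back to std::sqrt elsewhere. The function name asm_sqrt is hypothetical. Because fsqrt works at the current x87 precision setting before the result is stored as a double, this is also a place where the widening question above shows up.

```cpp
#include <cmath>

// Hypothetical helper: square root through a bare x87 fsqrt, with no
// errno handling and no retry through the library.
double asm_sqrt(double x) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    double r;
    // "=t" is the top of the x87 register stack; "0" places the input
    // in the same register as output operand 0.
    __asm__("fsqrt" : "=t"(r) : "0"(x));
    return r;
#else
    return std::sqrt(x);   // stand-in on non-x86 or non-GCC targets
#endif
}
```

Timing asm_sqrt against std::sqrt under the same optimization flags gives one way to measure the cost of the errno and exception support discussed above.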

I imagine it's difficult to get people excited nowadays about something specific to 32-bit compilers for a specific CPU architecture, when there are no longer any 32-bit CPUs in production.

