Re: Options for using hardware implementation of remainder and square root on i32
Kwok, Yipkei wrote:
Although the Intel x86 processors are capable of computing remainder
and square root operations in hardware, both g++ and icpc (Intel
compiler) realize these operations in software [1].
Question 1: Is there any specific reason why they do it this way
(using a software implementation instead of hardware)?
The usual way to implement sqrt, when optimizing, would be to execute
the sqrt instruction, check the processor flags, and retry with the
library function if any flag is set which would indicate a requirement
for errno processing or exception handling.
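
As a rough illustration of the inline-then-retry pattern described above,
here is a minimal C++ sketch. The names hw_sqrt and sqrt_with_retry are made
up for this example, and testing the result for NaN is only an assumed
stand-in for whatever flag check a given compiler actually emits.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: a single hardware sqrt where SSE2 is available.
static inline double hw_sqrt(double x) {
#if defined(__SSE2__)
    __m128d v = _mm_set_sd(x);
    return _mm_cvtsd_f64(_mm_sqrt_sd(v, v));  // sqrtsd on the low element
#else
    return std::sqrt(x);  // stand-in on targets without SSE2
#endif
}

// Hypothetical name: fast path inline, slow path through the library.
double sqrt_with_retry(double x) {
    double r = hw_sqrt(x);
    if (r != r) {             // NaN result: negative or NaN input
        return std::sqrt(x);  // let the library do errno / exception work
    }
    return r;
}
```

Note that the r != r test relies on NaN semantics, so it may be optimized
away under options such as -ffast-math.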
I don't know the answer as to whether C99 remainder functions are
commonly available in compilers when -std=c99 is set, and what g++ and
icpc may do to permit their use as an extension beyond the C++ standard.
Assuming that they have to be implemented with x87 code, while one would
normally be using SSE code generation options, the cost of moving
operands between the SSE and x87 units is presumably such that the code
size vs performance tradeoff favors a library function call.
When you have performance or accuracy critical code requiring
remaindering, you probably have to study the facilities of the target
platform.
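
For reference, with a C++11 or later library both remainder flavours are
available from <cmath>: std::remainder rounds the quotient to nearest (the
IEEE 754 definition, which is what the x87 fprem1 instruction computes),
while std::fmod truncates it (as fprem does). The sketch below just puts the
two side by side; RemainderPair and both_remainders are names invented for
this example.

```cpp
#include <cmath>

// Hypothetical helper type for this example only.
struct RemainderPair {
    double ieee;       // std::remainder: quotient rounded to nearest
    double truncated;  // std::fmod: quotient truncated toward zero
};

RemainderPair both_remainders(double x, double y) {
    return RemainderPair{ std::remainder(x, y), std::fmod(x, y) };
}
```

For example, both_remainders(5.5, 2.0) gives -0.5 for the IEEE remainder and
1.5 for fmod, so it matters which one a benchmark is actually measuring.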
Question 2: Is there any compiler option that forces these operations
to be done in hardware?
In order to support Fortran, gcc has a specific flag, -fno-math-errno,
to turn off the sqrt retry, thus breaking errno and exception handling.
g++ -ffast-math implies that flag, and icpc -fp-model fast=2 might
include a similar setting. Those are big hammers which evidently aren't
suitable for general use.
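
One way to see what a given option does is to compile a one-line probe and
read the generated assembly. The sketch below is only that; the file name
and function name are hypothetical, and the output described in the comments
is what I would typically expect from g++ with SSE code generation, not
something guaranteed for every version or target.

```cpp
// sqrt_probe.cpp (hypothetical file name)
#include <cmath>

double probe_sqrt(double x) {
    return std::sqrt(x);
}

// Assumed commands for comparing the generated code:
//   g++ -O2 -S sqrt_probe.cpp
//       typically an inline sqrt instruction plus a conditional call to
//       the library sqrt for the errno case
//   g++ -O2 -fno-math-errno -S sqrt_probe.cpp
//       typically the inline sqrt instruction only
```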
I need these answers in order to compare differences, if there are any,
in terms of performance and results.
If you want strictly correct results, according to IEEE 754, you must
consider whether you want your operations to be widened, e.g. according
to x87 precision setting, and, with icpc, you must set -prec-sqrt. These
issues are separate from whether you get an in-line sqrt instruction,
but ignoring them will invalidate your conclusions. There is a clear
performance hit in supporting errno and exception handling, which you
normally expect to incur in C++.
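
A quick way to find out whether a given build widens intermediate results is
the FLT_EVAL_METHOD macro from <cfloat> (C99, and C++11 onward). The helper
below is a hypothetical wrapper that only turns the macro value into text;
it says nothing about the x87 precision control word, which has to be
checked separately.

```cpp
#include <cfloat>

// Hypothetical helper: describe how intermediate results are evaluated.
const char* eval_method_description() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "operations evaluated in their own type (typical for SSE)";
        case 1:  return "float and double evaluated as double";
        case 2:  return "all operations evaluated as long double (typical for x87)";
        default: return "indeterminate or implementation-defined";
    }
}
```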
You have the option of writing in-line asm code, if that is what you
want to evaluate.
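
If you go the in-line asm route, something along the following lines might
serve as a starting point. This is a sketch assuming GNU extended asm on an
x86 target with the x87 unit enabled; x87_sqrt, x87_remainder and
HAVE_X87_ASM are names invented here, and other targets simply fall back to
the library.

```cpp
#include <cmath>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X87_ASM 1   // hypothetical macro for this example
#else
#define HAVE_X87_ASM 0
#endif

// x87 square root: operand and result both live in st(0).
double x87_sqrt(double x) {
#if HAVE_X87_ASM
    double r;
    __asm__ ("fsqrt" : "=t"(r) : "0"(x));
    return r;
#else
    return std::sqrt(x);  // fallback, not the hardware path
#endif
}

// x87 IEEE remainder: fprem1 yields a partial result, so repeat until
// the C2 status flag (bit 2 of AH after fnstsw) is clear.
double x87_remainder(double x, double y) {
#if HAVE_X87_ASM
    double r;
    __asm__ __volatile__ (
        "1: fprem1\n\t"
        "fnstsw %%ax\n\t"
        "testb $4, %%ah\n\t"
        "jnz 1b"
        : "=t"(r) : "0"(x), "u"(y) : "ax", "cc");
    return r;
#else
    return std::remainder(x, y);  // fallback, not the hardware path
#endif
}
```

Because fsqrt works at the current x87 precision and the result is then
rounded to double, it may differ in the last bit from what an SSE sqrt
produces, which is the widening issue mentioned earlier.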
I imagine it's difficult to get people excited nowadays about something
specific to 32-bit compilers for a specific CPU architecture, when there
are no longer any 32-bit CPUs in production.