Hi Tim,
Thank you very much for your prompt response. Please see below.
I really appreciate your help. Thanks again!
Regards,
Yipkei
Tim Prince wrote:
Kwok, Yipkei wrote:
Although the Intel x86 processors are capable of computing remainder
and square root operations in hardware, both g++ and icpc (Intel
compiler) realize these operations in software [1].
Question 1: Is there any specific reason why they do it this way
(using software implementation instead of hardware)?
The usual way to implement sqrt, when optimizing, would be to execute
the sqrt instruction, check the processor flags, and retry with the
library function if any flag is set which would indicate a requirement
for errno processing or exception handling.
I don't know the answer as to whether C99 remainder functions are
commonly available in compilers when -std=c99 is set, and what g++ and
icpc may do to permit their use as an extension beyond C++ standard.
Assuming that they have to be implemented with x87 code, while one would
normally be using SSE code generation options, the performance
implications become such that the code size vs performance tradeoff
dictates a library function call.
When you have performance or accuracy critical code requiring
remaindering, you probably have to study the facilities of the target
platform.
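
To make the "remaindering" distinction concrete, here is a small sketch assuming a C++11 (or C99-compatible) math library. The struct and function names are hypothetical. std::fmod truncates the quotient, while std::remainder rounds it to nearest as IEEE 754 specifies; on x87 these correspond to the fprem and fprem1 instructions respectively:

```cpp
#include <cmath>

// Hypothetical container for the two kinds of remainder.
struct Remainders {
    double truncated;  // std::fmod: quotient truncated toward zero
    double nearest;    // std::remainder: quotient rounded to nearest
};

// Hypothetical helper: compute both remainders of x / y.
inline Remainders both_remainders(double x, double y) {
    return Remainders{ std::fmod(x, y), std::remainder(x, y) };
}
```

For x = 5.5 and y = 2.0 the truncated remainder is 1.5 and the IEEE remainder is -0.5, so the two functions are not interchangeable when comparing results.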
I will give C99 a try. Thank you very much for suggesting it.
Question 2: Is there any compiler option that forces these operations
to be done in hardware?
In order to support Fortran, gcc has a specific flag to turn off the
sqrt retry, thus breaking errno and exception handling. Maybe g++
-ffast-math or icpc -fp-model fast=2 might include such an option.
Those are big hammers which evidently aren't suitable for general use.
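
The GCC flag referred to here is most likely -fno-math-errno, which -ffast-math also enables. A minimal way to check its effect, assuming g++ on x86; the file name sqrt_demo.cpp and the function name root are hypothetical, and the exact assembly varies by compiler version:

```cpp
// sqrt_demo.cpp (hypothetical file name)
#include <cmath>

double root(double x) {
    return std::sqrt(x);
}

// Compare the generated assembly:
//   g++ -O2 -S sqrt_demo.cpp                  # sqrt instruction plus a fallback call
//   g++ -O2 -fno-math-errno -S sqrt_demo.cpp  # sqrt instruction only
```

Unlike -ffast-math, -fno-math-errno by itself leaves the arithmetic alone and only drops the errno bookkeeping.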
Does this mean that no executable compiled from a Fortran program with
gfortran can execute the hardware sqrt instruction, since, from your
description above, the sqrt retry is needed for errno processing and
exception handling?
I need these answers in order to compare differences, if there are any,
in terms of performance and results.
If you want strictly correct results, according to IEEE754, you must
consider whether you want your operations to be widened, e.g. according
to x87 precision setting, and, with icpc, you must set -prec-sqrt. This
is not a question of whether you have an in-line sqrt instruction, but
will destroy your conclusions if you ignore the issues. There is a clear
performance hit in supporting errno and exception handling, which you
normally expect to incur in C++.
You have the option of writing in-line asm code, if that is what you
want to evaluate.
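
A minimal inline asm sketch for that kind of experiment, assuming a compiler that accepts GNU-style extended asm on x86. The name asm_sqrt is made up, and there is deliberately no errno or exception handling here:

```cpp
#include <cmath>

// Hypothetical helper: square root via GNU-style extended inline asm.
inline double asm_sqrt(double x) {
#if defined(__GNUC__) && defined(__SSE2__)
    double r;
    // AT&T syntax: sqrtsd source, destination; "x" requests an SSE register.
    __asm__("sqrtsd %1, %0" : "=x"(r) : "x"(x));
    return r;
#elif defined(__GNUC__) && defined(__i386__)
    double r;
    // x87 path: operand and result both sit on top of the FPU stack ("t").
    __asm__("fsqrt" : "=t"(r) : "0"(x));
    return r;
#else
    return std::sqrt(x);  // other targets: keep the sketch compilable
#endif
}
```

Timing this against std::sqrt under the same optimization flags isolates the cost of the library fallback from the cost of the instruction itself.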
I imagine it's difficult to get people excited nowadays about something
specific to 32-bit compilers for a specific CPU architecture, when there
are no longer any 32-bit CPUs in production.
I began this adventure on 32-bit systems and I will continue on
64-bit systems. If anyone has addressed this issue on 64-bit
systems, I would love to hear about it.
--
********************************************
Yipkei Kwok
Ph.D. Student
Research Assistant
HiPerSys Lab
Department of Computer Science
The University of Texas at El Paso
Phone: 915 747 6433 (O)
E-mail: ykwok2 at miners dot utep dot edu
********************************************