Sorry that my previous message went out with no value added.
I think you need to look at just what it is that your feclearexcept()
does. From the strace information, it looks as if it may be that the
FPU emulator is erroneously throwing an exception in response to some
manipulation of the emulated FPU registers by feclearexcept(), so that
it's taking a second FP exception within the signal handler. That's the
simplest explanation for the macroscopic behavior, anyway.
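One way to test that theory is to take feclearexcept() out of the handler altogether. What follows is only a rough sketch under my own assumptions, not a fix: the names install_fpe_handler(), run_guarded(), last_fpe_code() and simulate_fpe() are made up for illustration, and simulate_fpe() uses raise(SIGFPE) as a stand-in for the trapping division so that the control flow can be exercised on any machine. The handler touches no FPU state; it records si_code and leaves through siglongjmp(), which also restores the signal mask saved by sigsetjmp().

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_env;                 /* jump target for the handler */
static volatile sig_atomic_t last_si_code; /* si_code of the last SIGFPE  */

/* Handler that deliberately does NOT call feclearexcept(). */
static void fpe_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_si_code = info->si_code;
    siglongjmp(fpe_env, 1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_fpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}

/* Run fn(); returns 0 if it completed, 1 if SIGFPE interrupted it. */
int run_guarded(void (*fn)(void))
{
    if (sigsetjmp(fpe_env, 1) == 0) {  /* 1: save the signal mask too */
        fn();
        return 0;
    }
    return 1;
}

/* si_code seen by the handler, e.g. FPE_FLTDIV for a real FP divide trap. */
int last_fpe_code(void)
{
    return (int)last_si_code;
}

/* Hypothetical stand-in for the faulting division. */
void simulate_fpe(void)
{
    raise(SIGFPE);
}
```

To try it against the real case, call feenableexcept( FE_DIVBYZERO ) first, pass run_guarded() a function that performs the 5.0 / 0.0 division, and call feclearexcept() only after run_guarded() has returned. If that variant survives on the FPU-less system, the write that feclearexcept() does from inside the handler becomes the prime suspect.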
Regards,
Kevin K.
Shane McDonald wrote:
I have run into some strange behaviour in the FPU emulation
software in the MIPS kernel when trying to handle a floating
point exception caused by a divide-by-zero.
I have come up with a simple test case to demonstrate this problem.
--
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <fenv.h>
#include <setjmp.h>

void fpe_handler(int);

jmp_buf env;

main()
{
    double x;

    feenableexcept( FE_DIVBYZERO );
    signal( SIGFPE, fpe_handler );

    if ( setjmp( env ) == 0 )
    {
        printf( "About to try calculation\n" );
        x = 5.0 / 0.0;
        printf( "Value is %f\n", x );
    }
    else
    {
        printf( "Calculation causes divide by zero\n" );
    }
}

void fpe_handler(int x)
{
    feclearexcept( FE_DIVBYZERO );
    longjmp( env, 1 );
}
--
The program enables trapping so that a divide-by-zero raises SIGFPE
rather than setting the result to infinity. A handler catches the
signal and jumps back into main, and the end result should be to print
out the "Calculation causes divide by zero" message.
I have two MIPS-based systems, both running Debian Etch. One of the
systems is a PMC-Sierra RM7035C-based system, which includes an FPU. My
other system is a PMC-Sierra MSP7120-based system, which does not
include an FPU. The RM7035C system is running the 2.6.34-rc6 kernel,
but the MSP7120 system is running 2.6.28.
When I run this program on the system with the FPU, I see the results
that I expect to see. The program outputs:
About to try calculation
Calculation causes divide by zero
I see the same results when I run the program on an x86 Debian Etch system.
When I run the program on the system without the FPU, I see:
About to try calculation
Floating point exception
So, it appears that the floating point exception is not caught.
However, when I run strace, the last few lines of output are:
old_mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ace9000
write(1, "About to try calculation\n"..., 25About to try calculation
) = 25
--- SIGFPE (Floating point exception) @ 0 (0) ---
--- SIGFPE (Floating point exception) @ 0 (0) ---
+++ killed by SIGFPE +++
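One possible reading of the doubled SIGFPE above, offered as an assumption rather than a diagnosis: while a handler for SIGFPE is running, SIGFPE itself is normally blocked, so a second floating point fault raised inside the handler cannot be delivered to it and the kernel falls back to killing the process. The sketch below only demonstrates the first half of that, namely that the signal is blocked during its own handler. The function name sigfpe_blocked_during_handler() is made up for illustration, and raise() stands in for a real fault.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* 1 if the signal was blocked inside its handler, 0 if not, -1 if unknown. */
static volatile sig_atomic_t blocked_in_handler = -1;

static void probe_handler(int sig)
{
    sigset_t cur;

    /* A NULL new set means "query only": the mask is left unchanged. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == 0)
        blocked_in_handler = sigismember(&cur, sig);
}

/* Returns 1 if SIGFPE is blocked while its own handler runs. */
int sigfpe_blocked_during_handler(void)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 /* no SA_NODEFER: default blocking */

    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -1;

    blocked_in_handler = -1;
    raise(SIGFPE);                   /* handler runs before raise() returns */
    sigaction(SIGFPE, &old, NULL);   /* restore the previous disposition */

    return (int)blocked_in_handler;
}
```

If the function returns 1, any instruction in the handler that faults with the same signal would hit a blocked SIGFPE, which would match the two back-to-back SIGFPE lines followed by the kill.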
Running it on the system with the FPU, I see:
old_mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ace5000
write(1, "About to try calculation\n"..., 25About to try calculation
) = 25
--- SIGFPE (Floating point exception) @ 0 (0) ---
write(1, "Calculation causes divide by zero"..., 34Calculation causes divide by zero
) = 34
exit_group(34) = ?
After poking around for a while, and trying to account for differences
between the systems (endianness, FPUness, kernel version), I believe the
problem is related to the lack of FPU. If I run the RM7035C with a
disabled FPU (kernel parameter nofpu), I see the same results as on
the FPU-less MSP7120. So, I suspect this difference in behaviour
is caused by the FPU emulation software.
Now, I don't know if this is a problem, but it does seem strange.
My level of understanding of the FPU emulation software is very low,
so I'm not quite sure where to look.
This isn't actually something that I typically do. I noticed this
problem when trying to understand why the Debian package "yorick"
failed to build (see
http://lists.debian.org/debian-mips/2010/04/msg00019.html).
I'd appreciate any insight that anyone can provide. Thanks!
Shane McDonald