Shane McDonald wrote:
I have run into some strange behaviour in the FPU emulation software in the MIPS kernel when trying to handle a floating point exception caused by a divide by zero. I have come up with a simple test case to demonstrate this problem.

--
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <fenv.h>
#include <setjmp.h>

void fpe_handler(int);

jmp_buf env;

main()
{
    double x;

    feenableexcept( FE_DIVBYZERO );
    signal( SIGFPE, fpe_handler );

    if ( setjmp( env ) == 0 )
    {
        printf( "About to try calculation\n" );
        x = 5.0 / 0.0;
        printf( "Value is %f\n", x );
    }
    else
    {
        printf( "Calculation causes divide by zero\n" );
    }
}

void fpe_handler(int x)
{
    feclearexcept( FE_DIVBYZERO );
    longjmp( env, 1 );
}
--

The program arranges for a SIGFPE to be generated when a divide-by-zero occurs, rather than setting the result to infinity. It installs a handler to catch the exception, and the end result should be to print out the "Calculation causes divide by zero" message.

I have two MIPS-based systems, both running Debian Etch. One of the systems is a PMC-Sierra RM7035C-based system, which includes an FPU. My other system is a PMC-Sierra MSP7120-based system, which does not include an FPU. The RM7035C system is running the 2.6.34-rc6 kernel, but the MSP7120 system is running 2.6.28.

When I run this program on the system with the FPU, I see the results that I expect to see. The program outputs:

    About to try calculation
    Calculation causes divide by zero

I see the same results when I run the program on an x86 Debian Etch system.

When I run the program on the system without the FPU, I see:

    About to try calculation
    Floating point exception

So, it appears that the floating point exception is not caught.
However, when I run strace, the last few lines of output are:

    old_mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ace9000
    write(1, "About to try calculation\n"..., 25About to try calculation
    ) = 25
    --- SIGFPE (Floating point exception) @ 0 (0) ---
    --- SIGFPE (Floating point exception) @ 0 (0) ---
    +++ killed by SIGFPE +++

Running it on the system with the FPU, I see:

    old_mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ace5000
    write(1, "About to try calculation\n"..., 25About to try calculation
    ) = 25
    --- SIGFPE (Floating point exception) @ 0 (0) ---
    write(1, "Calculation causes divide by zero"..., 34Calculation causes divide by zero
    ) = 34
    exit_group(34) = ?

After poking around for a while, and trying to account for differences between the systems (endianness, FPUness, kernel version), I believe the problem is related to the lack of FPU. If I run the RM7035C with a disabled FPU (kernel parameter nofpu), I see the same results as on the FPU-less MSP7120. So, I suspect this difference in behaviour is caused by the FPU emulation software.

Now, I don't know if this is a problem, but it does seem strange. My level of understanding of the FPU emulation software is very low, so I'm not quite sure where to look.

This isn't actually something that I typically do. I noticed this problem when trying to understand why the Debian package "yorick" failed to build (see http://lists.debian.org/debian-mips/2010/04/msg00019.html).

I'd appreciate any insight that anyone can provide.

Thanks!

Shane McDonald
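
For anyone trying to narrow down where the second SIGFPE in the strace output comes from, a variant of the handler setup might look something like the sketch below. It is not part of the original test case, and every name in it (fpe_env, last_fpe_code, fpe_code_name, fpe_info_handler, install_fpe_handler, run_guarded, raise_fpe) is made up for illustration. It swaps signal() for sigaction() with SA_SIGINFO so the handler can record si_code, and swaps setjmp/longjmp for sigsetjmp/siglongjmp so the signal mask is restored when jumping out of the handler. Whether either change affects the behaviour on the FPU-less system is not something the report establishes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* All names below are hypothetical; none appear in the original test case. */
sigjmp_buf fpe_env;
volatile sig_atomic_t last_fpe_code;

/* Map a SIGFPE si_code value to a printable name. */
const char *fpe_code_name(int code)
{
    switch (code) {
    case FPE_INTDIV: return "FPE_INTDIV";
    case FPE_INTOVF: return "FPE_INTOVF";
    case FPE_FLTDIV: return "FPE_FLTDIV";
    case FPE_FLTOVF: return "FPE_FLTOVF";
    case FPE_FLTUND: return "FPE_FLTUND";
    case FPE_FLTRES: return "FPE_FLTRES";
    case FPE_FLTINV: return "FPE_FLTINV";
    case FPE_FLTSUB: return "FPE_FLTSUB";
    default:         return "other";
    }
}

/* Record why the signal was raised, then jump back to run_guarded(). */
void fpe_info_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_fpe_code = info->si_code;
    siglongjmp(fpe_env, 1);
}

/* Install the handler with SA_SIGINFO; returns 0 on success, -1 on error. */
int install_fpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fpe_info_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}

/* Run fn(); return 0 if it completes, 1 if a SIGFPE interrupted it.
 * The nonzero second argument to sigsetjmp saves the signal mask, so
 * siglongjmp restores it and SIGFPE is not left blocked afterwards. */
int run_guarded(void (*fn)(void))
{
    assert(fn != NULL);
    if (sigsetjmp(fpe_env, 1) == 0) {
        fn();
        return 0;
    }
    return 1;
}

/* Software stand-in for the faulting computation, so the sketch can be
 * exercised on a machine where floating point traps are not enabled. */
void raise_fpe(void)
{
    raise(SIGFPE);
}
```

To use this with the test case above, one would call install_fpe_handler() after feenableexcept( FE_DIVBYZERO ), wrap the 5.0 / 0.0 calculation in a function passed to run_guarded(), and print fpe_code_name(last_fpe_code) afterwards. POSIX defines FPE_FLTDIV as the code for a floating point divide by zero, so a different value on the FPU-less system could be a useful data point.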