Blair Barnett wrote:
Greetings,

We have a multi-threaded application that runs on an embedded Linux platform. In the past, the application was built with the GNU gcc 3.3.2 compiler and ran on a PXA-based system running Linux 2.4.27. We built a SEGV signal handler that we armed in each thread's main() function. This signal handler simply called the libc backtrace() function. This scenario worked very well for our device beta test by non-technical users. Whenever the system crashed with a SEGV, we squirreled away the information on the device and retrieved it at convenient times, namely when the device was connected to our server over a wireless network.

Now we're moving to a new platform, still ARM-based, running Linux 2.6.21. Our application has been ported to GNU gcc 4.1.1. What we have discovered is that backtrace() doesn't work under 4.1.1, so we can't do the same thing we did under 3.3.2. We found two patches to 4.1.1:

http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00357.html
http://www.ecos.sourceware.org/ml/libc-ports/2006-06/msg00045.html

When we applied these patches to gcc 4.1.1, backtrace() tended to work a little better, but the backtrace is still inconsistent at unwinding the frame pointer, and the registers and program context tend to be those of the signal handler, which was NOT the case under 3.3.2.

We need a way to fix this problem, if a solution exists. One solution we tried was to build our own backtrace() function, but it showed inconsistent results, and it would be much better if we could leverage the libc backtrace() function. Another solution was to fork gdb in batch mode from the signal handler instead of calling backtrace(). This was also sub-optimal: it produced inconsistent backtraces and, in addition, requires gdb on the embedded device, which is space-inefficient.

I've included a test program (not multi-threaded) that we have used to test various solutions. If you have a solution to this problem, please let us know.
If you need any more information, please let us know.
Perhaps the patches related to this message would be of interest: http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01388.html
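
For reference, the setup described in the quoted message (a SEGV handler that calls the libc backtrace() function) can be sketched roughly as below. This is only a minimal illustration assuming glibc's <execinfo.h>; it is not the original poster's test program, and it does not by itself address the ARM unwinding problem. The names MAX_FRAMES, crash_fd, dump_backtrace() and install_segv_handler() are hypothetical and chosen only for this example.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64   /* hypothetical limit on the number of frames recorded */

/* Hypothetical: descriptor the trace is written to (stderr by default).
 * A real device would more likely point this at a file opened at startup. */
static int crash_fd = STDERR_FILENO;

/* Write the current call stack to fd; returns the number of frames captured. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols_fd() writes straight to fd and, unlike
     * backtrace_symbols(), does not allocate the result with malloc(). */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}

/* SIGSEGV handler: log the stack, then let the default action kill the process. */
static void segv_handler(int sig, siginfo_t *info, void *ucontext)
{
    static const char msg[] = "SIGSEGV caught, backtrace follows:\n";
    ssize_t r;

    (void)info;
    (void)ucontext;   /* the faulting context is available here but unused in this sketch */

    r = write(crash_fd, msg, sizeof msg - 1);
    (void)r;
    dump_backtrace(crash_fd);

    /* Restore the default disposition and re-raise so the process
     * still terminates with SIGSEGV. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the handler; returns 0 on success, -1 on failure (as sigaction() does). */
int install_segv_handler(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* Call backtrace() once up front so that any lazy loading it needs
     * happens here rather than inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() once early in the program should be enough, since signal dispositions are shared by all threads in a process. Linking with -rdynamic generally makes the function names printed by backtrace_symbols_fd() more useful; without it, many frames show only addresses.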