On 10 April 2015 at 08:25, Patrick Begou <Patrick.Begou@xxxxxxxxxxxxxxxxxxxx> wrote:
> I have also tested the -gdwarf-2 option in gcc 4.8.2. I get the same output
> with addr2line.
>
> Patrick
>
>
> Janne Blomqvist wrote:
>>
>> On Thu, Apr 9, 2015 at 8:48 PM, Toon Moene <toon@xxxxxxxxx> wrote:
>>>
>>> On 04/09/2015 09:06 AM, Patrick Begou wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm working on a large parallel Fortran application which (sometimes)
>>>> gives a segfault. When this error occurs I would like to backtrace
>>>> the call stack to know where it takes place, but I'm unable to get this
>>>> information, only a list of memory addresses. I've built a small
>>>> test-case (with an error in array dimension creating a segmentation
>>>> fault in a subroutine) to investigate gfortran/gcc options.
>>>>
>>>> With gcc version 4.8.2 using options "-g -fbacktrace -gdwarf-3" I get
>>>> ./plante
>>>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>>>> reference.
>>>> Backtrace for this error:
>>>> #0 0x7F99F71A9AC7
>>>> #1 0x7F99F71AA0CE
>>>> #2 0x7F99F67A9B2F
>>>> Segmentation fault
>>>>
>>>> but addr2line -e ./plante 0x7F99F71AA0CE
>>>> returns: ??:0
>>>>
>>>> What have I missed?
>>
>> Depending on which version of binutils you have, addr2line may have
>> problems understanding the DWARF-3 debug format. Try with -gdwarf-2.
>>
>>
>>> Hard to say. I have the same problem with a (far smaller) program of our
>>> weather forecasting suite. Compiled with gfortran 4.9 and linked against
>>> the OpenMPI libraries, I get this:
>>>
>>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>>> reference.
>>>
>>> Backtrace for this error:
>>> #0 0x2ADE9148E407
>>> #1 0x2ADE9148EA1E
>>> #2 0x2ADE91F1C17F
>>> #3 0x5EB24E in update_desc_ at update_desc.F90:55
>>> #4 0x5E97D9 in swapoutdb_ at swapoutdb.F90:16 (discriminator 4)
>>> #5 0x40B259 in bator at Bator.F90:368 (discriminator 2)
>>>
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 0 with PID 27638 on node super.moene.org
>>> exited on signal 11 (Segmentation fault).
>>>
>>> --------------------------------------------------------------------------
>>>
>>> which looks reasonable to me. Perhaps the earlier addresses are simply
>>> within the OpenMPI library routines. The error most certainly isn't
>>> there, but what you passed as arguments to it.
>>
>> Typically the first few stack frames are due to the backtrace printing
>> routines in libgfortran. Another problem is that addr2line can't
>> resolve addresses from shared libraries (incl. libgfortran.so);
>> linking statically is a workaround.

I'd definitely try static linking, yes.

Maybe valgrind helps to pinpoint the fault?

I take it you're aware of the hints given in
http://www.open-mpi.org/faq/?category=debugging

HTH,
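
In case it helps with triage, here is a minimal Python sketch that splits a
-fbacktrace listing into frames that already carry a symbol and source line
and frames that are bare addresses. It is my own illustration, not part of
gfortran or binutils; the names parse_backtrace and unresolved_addresses are
hypothetical, and the regular expression only assumes the frame layout seen
in the two backtraces quoted above.

```python
import re

# Frame layout assumed from the backtraces quoted in this thread, e.g.
#   "#0 0x7F99F71A9AC7"
#   "#3 0x5EB24E in update_desc_ at update_desc.F90:55"
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9A-Fa-f]+)"
    r"(?:\s+in\s+(?P<func>\S+)\s+at\s+(?P<file>[^\s:]+):(?P<line>\d+))?"
)

# Backtrace copied from the gfortran 4.9 / OpenMPI run quoted above.
SAMPLE = """\
#0 0x2ADE9148E407
#1 0x2ADE9148EA1E
#2 0x2ADE91F1C17F
#3 0x5EB24E in update_desc_ at update_desc.F90:55
#4 0x5E97D9 in swapoutdb_ at swapoutdb.F90:16 (discriminator 4)
#5 0x40B259 in bator at Bator.F90:368 (discriminator 2)
"""


def parse_backtrace(text):
    """Return one dict per frame found in a -fbacktrace listing."""
    frames = []
    for m in FRAME_RE.finditer(text):
        frames.append({
            "index": int(m.group("index")),
            "addr": m.group("addr"),
            # func/file/line stay None when the runtime printed only an address.
            "func": m.group("func"),
            "file": m.group("file"),
            "line": int(m.group("line")) if m.group("line") else None,
        })
    return frames


def unresolved_addresses(frames):
    """Addresses of frames that came without symbol or source information."""
    return ["0x" + f["addr"] for f in frames if f["func"] is None]


if __name__ == "__main__":
    for frame in parse_backtrace(SAMPLE):
        print(frame)
    print("bare addresses:", unresolved_addresses(parse_backtrace(SAMPLE)))
```

The bare addresses are the ones to look at first: as noted above, they may
belong to shared libraries such as libgfortran.so, in which case addr2line run
against the executable alone cannot resolve them.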