Re: Segfault in __c_f_f_c during strace of nptl application.

> I saw strace segfault in __canonicalize_funcptr_for_compare while
> trying to trace an nptl enabled hppa application.
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0002b3bc in __canonicalize_funcptr_for_compare ()
> Current language:  auto; currently asm
> (gdb) bt
> #0  0x0002b3bc in __canonicalize_funcptr_for_compare ()
> #1  0x00025fec in sys_rt_sigaction (tcp=0x4e070) at signal.c:1886
> #2  0x00017aec in trace_syscall (tcp=0x4e070) at syscall.c:2549
> #3  0x00016c98 in main (argc=<value optimized out>, argv=0xc032f01c)
> at strace.c:2475
> (gdb)
> 
> It's 100% reproducible. What should I try to debug this?

Look at the comparison in sys_rt_sigaction.  The last time this
happened, this involved a comparison with a special value that
wasn't a function pointer.  I think you are seeing the same problem
as Kyle.

I believe that the segv can be avoided by casting the values in the
comparison:

> > > which is:
> > > 
> > >                 if (sa.__sigaction_handler.__sa_handler == SIG_ERR)
> > >                         tprintf("{SIG_ERR, ");
> > 
> > Is __canonicalize_funcptr_for_compare choking on SIG_ERR?  This is
> > a special value (-1).  The plabel bit is set.

However, this is a cop-out.  __canonicalize_funcptr_for_compare
actually faults on sa.__sigaction_handler.__sa_handler.

You can set a breakpoint on __canonicalize_funcptr_for_compare and look
at what's being passed in.

Looking at my email archive, I see the real cause involves kernel memory
maps:

  > > > On Wed, May 06, 2009 at 01:39:49PM -0400, John David Anglin wrote:
  > > > > > The tombstone is:
  > > > > > 
  > > > > > do_page_fault() pid=10205 command='strace' type=15 address=0x407d2f18
  > > > > > vm_start = 0x4068d000, vm_end = 0x4068f000
  > > > > 
  > > > > So, the pointer passed to __canonicalize_funcptr_for_compare is outside
  > > > > the vm range.
  > > > > 
  > > > > Maybe "info sharedlib" will show something.  Need to find out why the
  > > > > address of the function descriptor is outside the vm range.
  > > > > 
  > > > > > > 405c0000-405c2000 rwxp 405c0000 00:00 0 
  > > 
  > > The function pointer address is also outside this range.
  > > 
  > 
  > Sorry, this was with a rebuilt binary, and it lies within this range.
  
  It's marked rwxp, so why the fault?

We never figured out why the fault actually occurred (Kyle got busy).
It seems like there is a problem with the address mapping during signals.
There were some rebuilds in the above, though, so I'm not sure the
analysis is correct.  I am sure the problem isn't with
__canonicalize_funcptr_for_compare itself.

So, the quick fix to get strace going is to rebuild it with the function
pointers in the comparison cast to long.  However, I think you will find
that it has other problems.
You might have more success with the old version that Kyle patched a
year or so ago (posted in debian people).  Randolph was working on a program
called atrace.  I tried it but didn't have much luck with it.
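
As a rough sketch of the cast workaround (this is not the actual strace
patch): on hppa, gcc emits a call to __canonicalize_funcptr_for_compare
when two function pointers are compared, so converting both sides to an
integer type first turns the test into a plain integer compare.  The
names handler_name and example_handler below are hypothetical, made up
for illustration only.

```c
#include <signal.h>

/* Hypothetical stand-in for a real signal handler. */
static void example_handler(int sig)
{
	(void) sig;
}

/*
 * Hypothetical helper: classify a handler value without ever comparing
 * two function pointers directly.  Both sides are converted to
 * unsigned long, so the comparison is done on integers.
 */
static const char *handler_name(void (*handler)(int))
{
	unsigned long h = (unsigned long) handler;

	if (h == (unsigned long) SIG_ERR)
		return "SIG_ERR";
	if (h == (unsigned long) SIG_DFL)
		return "SIG_DFL";
	if (h == (unsigned long) SIG_IGN)
		return "SIG_IGN";
	return "handler";
}
```

This only sidesteps the fault in the comparison; it does nothing about
whatever makes the descriptor address unreadable in the first place.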

PS: How's NPTL coming?

Dave
-- 
J. David Anglin                                  dave.anglin@xxxxxxxxxxxxxx
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)