Re: Re:[RFC] Crash patch for DWARF CFI based unwind support

Dave Anderson <anderson@xxxxxxxxxx> · Mon, 23 Oct 2006 09:31:11 -0400

Rachita Kothiyal wrote:

> On Thu, Oct 19, 2006 at 05:15:32PM -0400, Dave Anderson wrote:
> >
> > > There still are a couple of things which need to be done, viz
> > > 1. Extend to obtaining unwind info from modules as well (currently
> > >    done only for the kernel).
> > > 2. Currently reading the unwind info from the .eh_frame section only
> > >    (i.e. __start_unwind to __end_unwind). Need to add a facility to read
> > >    from .debug_frame when it is present and .eh_frame is absent. We
> > >    will have to read from the vmlinux file to get the .debug_frame
> > >    info.
> >
> > Hi Rachita,
> >
> > I hope to be able to come up with a new crash version
> > for you to continue working with by tomorrow, Monday at
> > the latest.
> >
> > Off the top of my head, here's what I've done with your
> > initial patch:
> >
> > 1. As Ben mentioned, it needs to be made compilable for
> >    other architectures.
> > 2. Renamed unwind_x86_64.c into unwind_x86_32_64.c,
> >    because the unwind code should be architecture
> >    neutral with respect to x86 and x86_64.  It's currently
> >    #ifdef'd to only be compiled if X86_64, but when a
> >    new "unwind_x86.h" file is ready to go, it can be
> >    made usable by both arches.
> > 3. Made it capable of reading .eh_frame data from the
> >    vmlinux file if it is not in memory.
> > 4. Made it capable of reading all of the module's unwind
> >    tables.
> > 5. Restored the unwind() function to reflect the kernel
> >    version in that it now uses a new find_table() routine,
> >    which returns a pointer to the local copy of the unwind
> >    table that contains the incoming pc.
> > 6. Cleaned up a bunch of cruft...
> >
>
> Hi Dave
>
> On the panic task, when we do the following:
>
>    set unwind on
>    bt
>    set unwind off
>    bt
>
> This last bt does not give us the same backtrace as what we get when crash
> first starts up (i.e. unwind is off by default). What is happening here is,
> when unwind is set to on and we do a 'bt', we go to
> get_netdump_regs_x86_64() to get rsp and rip, where
> ASSIGN_SIZE(user_regs_struct) happens, thereby setting
> VALID_STRUCT(user_regs_struct) to 1. Now when we next do 'set unwind off' and
> 'bt', we satisfy the following if condition in get_netdump_regs_x86_64() as
> VALID_STRUCT(user_regs_struct) is set:
>
>  if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
>        VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
>      (KDUMP_DUMPFILE() && (kt->flags & DWARF_UNWIND) &&
>        (bt->flags & BT_DUMPFILE_SEARCH))) {
>
> So this results in it reading the register values from the NT_PRSTATUS.
> Hence the backtrace looks different from what we get from the existing
> non-dwarf mechanism.
>
> To avoid this, we could use a local variable for the user_regs_struct size
> instead of changing things at the global scope with ASSIGN_SIZE(). Or
> invalidate the user_regs_struct before we leave from get_netdump_regs_x86_64().
>
> Or, if it is desired that registers be read for the panic task from the
> NT_PRSTATUS section in the normal non-dwarf backtrace mechanism (which
> currently does not work as expected because of the user_regs_struct
> initialisation problem in x86_64), then probably it will have to be fixed
> some other way.

Hmmm, yeah, good catch...

But what happens the second time around, anyway?  Are the RSP/RIP
starting points so different that the low_budget tracer's output
ends up drastically different?  Or does it go off into the weeds because
the other user_regs_struct register offsets (that don't get initialized)
cause an OFFSET() failure?
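
For reference, here is a minimal sketch of the side effect being discussed. It is not crash's actual code: the globals, the enum, and both get_regs_*() functions are hypothetical stand-ins, the dumpfile-type and BT_DUMPFILE_SEARCH checks are left out, and the size value is only illustrative. It assumes the size is assigned on the unwind path, as described above, and compares that with the "local variable" alternative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for crash's global state; not the real definitions. */
#define SIZE_UNSET (-1L)

static long g_user_regs_size = SIZE_UNSET; /* models the ASSIGN_SIZE() result */
static bool g_dwarf_unwind = false;        /* models kt->flags & DWARF_UNWIND */

enum regs_source { FROM_STACK_SEARCH, FROM_PRSTATUS };

/* Models VALID_STRUCT(user_regs_struct). */
static bool size_is_valid(void)
{
    return g_user_regs_size != SIZE_UNSET;
}

/* Variant with the side effect: the size is stored globally while
 * unwinding, so it stays "valid" after unwind is switched off. */
static enum regs_source get_regs_global(bool is_panic_task)
{
    if (g_dwarf_unwind)
        g_user_regs_size = 216; /* illustrative size only */

    if ((size_is_valid() && is_panic_task) || g_dwarf_unwind)
        return FROM_PRSTATUS;
    return FROM_STACK_SEARCH;
}

/* Variant using a local size: nothing global changes, so a later
 * non-unwind 'bt' takes the same path as at startup. */
static enum regs_source get_regs_local(bool is_panic_task)
{
    long local_size = SIZE_UNSET;

    if (g_dwarf_unwind)
        local_size = 216; /* would be used to read the registers */
    (void)local_size;

    if ((size_is_valid() && is_panic_task) || g_dwarf_unwind)
        return FROM_PRSTATUS;
    return FROM_STACK_SEARCH;
}

typedef enum regs_source (*get_regs_fn)(bool);

/* 'bt' on the panic task right after startup (unwind off). */
static enum regs_source bt_fresh(get_regs_fn fn)
{
    assert(fn != NULL);
    g_user_regs_size = SIZE_UNSET;
    g_dwarf_unwind = false;
    return fn(true);
}

/* 'set unwind on; bt; set unwind off; bt' -- returns the second bt's source. */
static enum regs_source bt_after_toggle(get_regs_fn fn)
{
    assert(fn != NULL);
    g_user_regs_size = SIZE_UNSET;
    g_dwarf_unwind = true;
    (void)fn(true);
    g_dwarf_unwind = false;
    return fn(true);
}
```

In this model, bt_after_toggle(get_regs_global) takes the NT_PRSTATUS path even though unwind is off again, while bt_after_toggle(get_regs_local) matches bt_fresh().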

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility