Re: [PATCH] arm64: fix backtraces of KASAN kernel dumpfile truncated

On Wed, Dec 7, 2022 at 11:22 AM dinghui <dinghui@xxxxxxxxxxxxxx> wrote:
>
> Hi, lijiang
>
> On 2022/12/7 10:55, lijiang wrote:
> > Hi, Ding
> > Thank you for the fix.
> > On Thu, Dec 1, 2022 at 8:00 PM <crash-utility-request@xxxxxxxxxx> wrote:
> >> Date: Thu,  1 Dec 2022 15:01:45 +0800
> >> From: Ding Hui <dinghui@xxxxxxxxxxxxxx>
> >> To: crash-utility@xxxxxxxxxx
> >> Cc: Ding Hui <dinghui@xxxxxxxxxxxxxx>
> >> Subject:  [PATCH] arm64: fix backtraces of KASAN kernel
> >>          dumpfile truncated
> >> Message-ID: <20221201070145.10830-1-dinghui@xxxxxxxxxxxxxx>
> >> Content-Type: text/plain; charset="US-ASCII"; x-default=true
> >>
> >> The "bt" command on a KASAN kernel vmcore displays truncated backtraces
> >> like this:
> >>
> >> crash> bt
> >> PID: 4131   TASK: ffff8001521df000  CPU: 3   COMMAND: "bash"
> >>   #0 [ffff2000224b0cb0] machine_kexec_prepare at ffff2000200bff4c
> >>
> >
> > Can you help explain how to reproduce this issue? What is the
> > critical kernel configuration? Can this issue always be
> > reproduced?
> >
> > I tried to reproduce it on the latest kernel (commit bce9332220bd)
> > with CONFIG_KASAN*=y, but could not reproduce this error.
> >
>
> I can always reproduce the issue with openEuler-20.03-LTS-SP3.
> I recompiled the openEuler kernel-4.19.90-xxx with CONFIG_KASAN=y; gcc is
> also from openEuler-20.03-LTS-SP3, version 7.3.0.
>
> I think it may be related to the compiler. Do you see a similar assembly
> pattern when you disassemble machine_kexec() or crash_save_cpu()?
>

I tried several times: a similar pattern can be observed on aarch64
(built with gcc-8.3.1), and the error can be reproduced.

With this patch applied, the crash tool works well (on dumpfiles with or
without this issue).

And this change looks good to me. So: Ack

Thanks.
Lianbo

> > crash> bt
> > PID: 23838    TASK: ffff3d83abe61300  CPU: 2    COMMAND: "bash"
> >   #0 [ffff800010587650] machine_kexec at ffffbb61b39bd460
> >   #1 [ffff8000105876a0] __crash_kexec at ffffbb61b3b8e2fc
> >   #2 [ffff8000105878c0] panic at ffffbb61b4b6c158
> >   #3 [ffff8000105879f0] sysrq_handle_crash at ffffbb61b43b1144
> >   #4 [ffff800010587a00] __handle_sysrq at ffffbb61b43b1ddc
> >   #5 [ffff800010587a80] write_sysrq_trigger at ffffbb61b43b26e0
> >   #6 [ffff800010587ab0] proc_reg_write at ffffbb61b3f9c4ac
> >   #7 [ffff800010587af0] vfs_write at ffffbb61b3ebeaf4
> >   #8 [ffff800010587c30] ksys_write at ffffbb61b3ebf054
> >   #9 [ffff800010587cd0] __arm64_sys_write at ffffbb61b3ebf144
> >  #10 [ffff800010587df0] invoke_syscall.constprop.0 at ffffbb61b39afc88
> >  #11 [ffff800010587e30] el0_svc_common.constprop.0 at ffffbb61b39afeac
> >  #12 [ffff800010587e60] do_el0_svc at ffffbb61b39aff08
> >  #13 [ffff800010587e80] el0_svc at ffffbb61b4b8defc
> >  #14 [ffff800010587ea0] el0t_64_sync_handler at ffffbb61b4b8e274
> >  #15 [ffff800010587fe0] el0t_64_sync at ffffbb61b39915bc
> >       PC: 0000ffffa42a4880   LR: 0000ffffa424321c   SP: 0000ffffe889dd60
> >      X29: 0000ffffe889dd60  X28: 0000000000000000  X27: 0000aaaad1c10000
> >      X26: 0000aaaad1c73018  X25: 0000aaaad1cb89bc  X24: 0000000000000002
> >      X23: 0000aaaadd9ce140  X22: 0000ffffa43e77e0  X21: 0000ffffa436a5c0
> >      X20: 0000aaaadd9ce140  X19: 0000000000000001  X18: 0000000000000000
> >      X17: 0000ffffa423ff50  X16: 0000ffffa4244600  X15: 0000ffffa4311350
> >      X14: 0000000000000000  X13: 0000000000000000  X12: 0000000000000000
> >      X11: 0000000000000020  X10: 0000000000000063   X9: 0000aaaadd9dc2c0
> >       X8: 0000000000000040   X7: 00000000ffffffff   X6: 0000000000000063
> >       X5: 0000aaaadd9ce141   X4: 0000aaaadd9dc2c1   X3: 0000ffffa43e7020
> >       X2: 0000000000000002   X1: 0000aaaadd9ce140   X0: 0000000000000001
> >      ORIG_X0: 0000000000000001  SYSCALLNO: 40  PSTATE: 20000000
> > crash>
> >
> >
> > Thanks.
> > Lianbo
> >
> >> After digging into the root cause, it turns out that arm64_in_kdump_text()
> >> found the wrong bt->bptr in the "machine_kexec" branch.
> >>
> >> Disassemble machine_kexec() of KASAN vmlinux (gcc 7.3.0):
> >>
> >> crash> dis -x machine_kexec
> >> 0xffff2000200bff50 <machine_kexec>:     stp     x29, x30, [sp,#-208]!
> >> 0xffff2000200bff54 <machine_kexec+0x4>: mov     x29, sp
> >> 0xffff2000200bff58 <machine_kexec+0x8>: stp     x19, x20, [sp,#16]
> >> 0xffff2000200bff5c <machine_kexec+0xc>: str     x24, [sp,#56]
> >> 0xffff2000200bff60 <machine_kexec+0x10>:        str     x26, [sp,#72]
> >> 0xffff2000200bff64 <machine_kexec+0x14>:        mov     x2, #0x8ab3
> >> 0xffff2000200bff68 <machine_kexec+0x18>:        add     x1, x29, #0x70
> >> 0xffff2000200bff6c <machine_kexec+0x1c>:        lsr     x1, x1, #3
> >> 0xffff2000200bff70 <machine_kexec+0x20>:        movk    x2, #0x41b5, lsl #16
> >> 0xffff2000200bff74 <machine_kexec+0x24>:        mov     x19, #0x200000000000
> >> 0xffff2000200bff78 <machine_kexec+0x28>:        adrp    x3, 0xffff2000224b0000
> >> 0xffff2000200bff7c <machine_kexec+0x2c>:        movk    x19, #0xdfff, lsl #48
> >> 0xffff2000200bff80 <machine_kexec+0x30>:        add     x3, x3, #0xcb0
> >> 0xffff2000200bff84 <machine_kexec+0x34>:        add     x4, x1, x19
> >> 0xffff2000200bff88 <machine_kexec+0x38>:        stp     x2, x3, [x29,#112]
> >> 0xffff2000200bff8c <machine_kexec+0x3c>:        adrp    x2, 0xffff2000200bf000 <swsusp_arch_resume+0x1e8>
> >> 0xffff2000200bff90 <machine_kexec+0x40>:        add     x2, x2, #0xf50
> >> 0xffff2000200bff94 <machine_kexec+0x44>:        str     x2, [x29,#128]
> >> 0xffff2000200bff98 <machine_kexec+0x48>:        mov     w2, #0xf1f1f1f1
> >> 0xffff2000200bff9c <machine_kexec+0x4c>:        str     w2, [x1,x19]
> >> 0xffff2000200bffa0 <machine_kexec+0x50>:        mov     w2, #0xf200
> >> 0xffff2000200bffa4 <machine_kexec+0x54>:        mov     w1, #0xf3f3f3f3
> >> 0xffff2000200bffa8 <machine_kexec+0x58>:        movk    w2, #0xf2f2, lsl #16
> >> 0xffff2000200bffac <machine_kexec+0x5c>:        stp     w2, w1, [x4,#4]
> >>
> >> We notice that:
> >> 1. The start address of machine_kexec() is 0xffff2000200bff50.
> >> 2. The instruction at machine_kexec+0x44 stores that same value,
> >>     0xffff2000200bff50 (from 0xffff2000200bf000 + 0xf50),
> >>     into stack position [x29,#128].
> >>
> >> When arm64_in_kdump_text() searches the stack for the LR, it meets
> >> 0xffff2000200bff50 first, so it gets the wrong bt->bptr.
> >>
> >> A real LR is always greater than the start address of the function
> >> it returns into, so let's fix this by changing the search condition to
> >> (*ptr > xxx_start) && (*ptr < xxx_end).
> >>
> >> Signed-off-by: Ding Hui <dinghui@xxxxxxxxxxxxxx>
> >> ---
> >>   arm64.c | 18 +++++++++---------
> >>   1 file changed, 9 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/arm64.c b/arm64.c
> >> index c3e26a3..7e8a7db 100644
> >> --- a/arm64.c
> >> +++ b/arm64.c
> >> @@ -3479,7 +3479,7 @@ arm64_in_kdump_text(struct bt_info *bt, struct arm64_stackframe *frame)
> >>          ms = machdep->machspec;
> >>          for (ptr = start - 8; ptr >= base; ptr--) {
> >>                  if (bt->flags & BT_OPT_BACK_TRACE) {
> >> -                       if ((*ptr >= ms->crash_kexec_start) &&
> >> +                       if ((*ptr > ms->crash_kexec_start) &&
> >>                              (*ptr < ms->crash_kexec_end) &&
> >>                              INSTACK(*(ptr - 1), bt)) {
> >>                                  bt->bptr = ((ulong)(ptr - 1) - (ulong)base)
> >> @@ -3488,7 +3488,7 @@ arm64_in_kdump_text(struct bt_info *bt, struct arm64_stackframe *frame)
> >>                                          fprintf(fp, "%lx: %lx (crash_kexec)\n", bt->bptr, *ptr);
> >>                                  return TRUE;
> >>                          }
> >> -                       if ((*ptr >= ms->crash_save_cpu_start) &&
> >> +                       if ((*ptr > ms->crash_save_cpu_start) &&
> >>                              (*ptr < ms->crash_save_cpu_end) &&
> >>                              INSTACK(*(ptr - 1), bt)) {
> >>                                  bt->bptr = ((ulong)(ptr - 1) - (ulong)base)
> >> @@ -3498,14 +3498,14 @@ arm64_in_kdump_text(struct bt_info *bt, struct arm64_stackframe *frame)
> >>                                  return TRUE;
> >>                          }
> >>                  } else {
> >> -                       if ((*ptr >= ms->machine_kexec_start) && (*ptr < ms->machine_kexec_end)) {
> >> +                       if ((*ptr > ms->machine_kexec_start) && (*ptr < ms->machine_kexec_end)) {
> >>                                  bt->bptr = ((ulong)ptr - (ulong)base)
> >>                                             + task_to_stackbase(bt->tc->task);
> >>                                  if (CRASHDEBUG(1))
> >>                                          fprintf(fp, "%lx: %lx (machine_kexec)\n", bt->bptr, *ptr);
> >>                                  return TRUE;
> >>                          }
> >> -                       if ((*ptr >= ms->crash_kexec_start) && (*ptr < ms->crash_kexec_end)) {
> >> +                       if ((*ptr > ms->crash_kexec_start) && (*ptr < ms->crash_kexec_end)) {
> >>                                  /*
> >>                                   *  Stash the first crash_kexec frame in case the machine_kexec
> >>                                   *  frame is not found.
> >> @@ -3519,7 +3519,7 @@ arm64_in_kdump_text(struct bt_info *bt, struct arm64_stackframe *frame)
> >>                                  }
> >>                                  continue;
> >>                          }
> >> -                       if ((*ptr >= ms->crash_save_cpu_start) && (*ptr < ms->crash_save_cpu_end)) {
> >> +                       if ((*ptr > ms->crash_save_cpu_start) && (*ptr < ms->crash_save_cpu_end)) {
> >>                                  bt->bptr = ((ulong)ptr - (ulong)base)
> >>                                             + task_to_stackbase(bt->tc->task);
> >>                                  if (CRASHDEBUG(1))
> >> @@ -3566,7 +3566,7 @@ arm64_in_kdump_text_on_irq_stack(struct bt_info *bt)
> >>
> >>          for (ptr = start - 8; ptr >= base; ptr--) {
> >>                  if (bt->flags & BT_OPT_BACK_TRACE) {
> >> -                       if ((*ptr >= ms->crash_kexec_start) &&
> >> +                       if ((*ptr > ms->crash_kexec_start) &&
> >>                              (*ptr < ms->crash_kexec_end) &&
> >>                              INSTACK(*(ptr - 1), bt)) {
> >>                                  bt->bptr = ((ulong)(ptr - 1) - (ulong)base) + stackbase;
> >> @@ -3576,7 +3576,7 @@ arm64_in_kdump_text_on_irq_stack(struct bt_info *bt)
> >>                                  FREEBUF(stackbuf);
> >>                                  return TRUE;
> >>                          }
> >> -                       if ((*ptr >= ms->crash_save_cpu_start) &&
> >> +                       if ((*ptr > ms->crash_save_cpu_start) &&
> >>                              (*ptr < ms->crash_save_cpu_end) &&
> >>                              INSTACK(*(ptr - 1), bt)) {
> >>                                  bt->bptr = ((ulong)(ptr - 1) - (ulong)base) + stackbase;
> >> @@ -3587,7 +3587,7 @@ arm64_in_kdump_text_on_irq_stack(struct bt_info *bt)
> >>                                  return TRUE;
> >>                          }
> >>                  } else {
> >> -                       if ((*ptr >= ms->crash_kexec_start) && (*ptr < ms->crash_kexec_end)) {
> >> +                       if ((*ptr > ms->crash_kexec_start) && (*ptr < ms->crash_kexec_end)) {
> >>                                  bt->bptr = ((ulong)ptr - (ulong)base) + stackbase;
> >>                                  if (CRASHDEBUG(1))
> >>                                          fprintf(fp, "%lx: %lx (crash_kexec on IRQ stack)\n",
> >> @@ -3595,7 +3595,7 @@ arm64_in_kdump_text_on_irq_stack(struct bt_info *bt)
> >>                                  FREEBUF(stackbuf);
> >>                                  return TRUE;
> >>                          }
> >> -                       if ((*ptr >= ms->crash_save_cpu_start) && (*ptr < ms->crash_save_cpu_end)) {
> >> +                       if ((*ptr > ms->crash_save_cpu_start) && (*ptr < ms->crash_save_cpu_end)) {
> >>                                  bt->bptr = ((ulong)ptr - (ulong)base) + stackbase;
> >>                                  if (CRASHDEBUG(1))
> >>                                          fprintf(fp, "%lx: %lx (crash_save_cpu on IRQ stack)\n",
> >> --
> >> 2.17.1
> >
> >
>
> --
> Thanks,
> - Ding Hui
>
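For readers following the thread, the false match described in the patch can be illustrated with a small stand-alone sketch. It is not the crash utility's code: find_lr_slot(), demo_stack, MACHINE_KEXEC_END and the return address at slot 1 are hypothetical names and values chosen for illustration. Only the function start address and the three words stored at [x29,#112] through [x29,#128] come from the disassembly quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Start address taken from the quoted "dis -x machine_kexec" output. */
#define MACHINE_KEXEC_START 0xffff2000200bff50ULL
/* Assumed end address; the real value is not given in the thread. */
#define MACHINE_KEXEC_END   (MACHINE_KEXEC_START + 0x400ULL)

/*
 * Hypothetical helper: walk the stack from the highest slot down to the
 * base, as the loops in the patch do, and return the index of the first
 * slot whose value falls inside the function's text range, or -1.
 * strict == 0 models the old test (>= start), strict == 1 the new one (> start).
 */
static long find_lr_slot(const uint64_t *base, size_t nslots,
                         uint64_t fn_start, uint64_t fn_end, int strict)
{
    assert(base != NULL);
    for (size_t i = nslots; i-- > 0; ) {
        uint64_t v = base[i];
        int above_start = strict ? (v > fn_start) : (v >= fn_start);
        if (above_start && v < fn_end)
            return (long)i;
    }
    return -1;
}

/* Illustrative stack image; index 0 is the lowest address. */
static const uint64_t demo_stack[] = {
    0x0000000000000000ULL,          /* 0: callee's saved x29 (placeholder)     */
    MACHINE_KEXEC_START + 0x1a0ULL, /* 1: assumed real LR inside machine_kexec */
    0x0000000000000000ULL,          /* 2: filler                               */
    0x0000000041b58ab3ULL,          /* 3: value stored at [x29,#112]           */
    0xffff2000224b0cb0ULL,          /* 4: value stored at [x29,#120]           */
    0xffff2000200bf000ULL + 0xf50,  /* 5: adrp + add result, at [x29,#128]     */
    0x0000000000000000ULL,          /* 6: filler                               */
};

#define DEMO_SLOTS (sizeof(demo_stack) / sizeof(demo_stack[0]))
```

With this layout, find_lr_slot(demo_stack, DEMO_SLOTS, MACHINE_KEXEC_START, MACHINE_KEXEC_END, 0) stops at slot 5, the stored copy of the function's own start address, while the same call with strict set to 1 skips it and returns slot 1, the slot holding a genuine return address.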

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki



