Re: [PATCH] ppc64: fix 'bt' command for vmcore captured with fadump.

Hari Bathini <hbathini@xxxxxxxxxxxxxxxxxx> · Wed, 25 Jan 2017 00:02:00 +0530

On Tuesday 24 January 2017 11:53 PM, Dave Anderson wrote:

----- Original Message -----

On Monday 23 January 2017 11:43 PM, Dave Anderson wrote:
----- Original Message -----
On Saturday 21 January 2017 02:00 AM, Dave Anderson wrote:
----- Original Message -----

... [cut] ...

Also, the exception frame doesn't even show the [bracketed] type of exception
that occurred -- it's just a register dump followed by the remainder of the
backtrace.  Upon a quick glance, it's not obvious that they are even active
tasks.  And traditionally, all of the other architectures have always dumped
a full trace.

I'm not sure what the mechanism is for shutting down the non-active
FADUMP tasks, so that's why I asked if you could restrict this change
to just those types of dumps.  (For that matter, is it even possible to
differentiate a real kdump from an FADUMP dumpfile --  aside from a
Hi Dave,

Differentiating a kdump dumpfile from an fadump dumpfile is not possible,
except that the stack search would invariably fail and ptregs are guaranteed
to be saved by firmware in case of fadump. Posted v2 that doesn't change bt
output for anything but active tasks in case of fadump..

Ok, so let me get this straight.  The only difference I see with the v2 patch
is that fadump non-panicking active tasks change from something like this:

     PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
      #0 [c000000000f2ba30] (null) at 3aae291c67  (unreliable)
      #1 [c000000000f2bae0] .tick_dev_program_event at c0000000000d16fc
      #2 [c000000000f2bb90] .__hrtimer_start_range_ns at c0000000000c4bcc
      #3 [c000000000f2bcb0] .tick_nohz_stop_sched_tick at c0000000000d2d30
      #4 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
      #5 [c000000000f2be70] .rest_init at c000000000009de4
      #6 [c000000000f2bef0] .start_kernel at c000000000850eb4
      #7 [c000000000f2bf90] .start_here_common at c0000000000083d8

to this:

     PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
      #0 [c000000000f2bd50] (null) at 0  (unreliable)
      #1 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
      #2 [c000000000f2be70] .rest_init at c000000000009de4
      #3 [c000000000f2bef0] .start_kernel at c000000000850eb4
      #4 [c000000000f2bf90] .start_here_common at c0000000000083d8

But with your v1 patch, you also dumped the exception frame:

     PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
      R0:  0000000000000000    R1:  c000000000f2bd50    R2:  c000000000f27628
      R3:  0000000000000000    R4:  0000000000000000    R5:  8000000002144400
      R6:  800000001314c4f8    R7:  0000000000000000    R8:  0000000000000000
      R9:  ffffffffffffffff    R10: 0000000000000000    R11: 80003fbff901700c
      R12: 0000000000000000    R13: c000000000ff2500    R14: 0000000001a3fa58
      R15: 00000000002230a8    R16: 0000000000223150    R17: 0000000000223144
      R18: 0000000000c8a098    R19: 0000000002b13a58    R20: 0000000000000000
      R21: 0000000002b135d8    R22: 0000000002b13530    R23: 0000000002280000
      R24: 0000000002b135f0    R25: c000000000fd5c48    R26: c0000000010942f0
      R27: c0000000010942f0    R28: c0000000005fd168    R29: 0000000000000008
      R30: c000000000eb1d68    R31: c000000000f28080
      NIP: c000000000055730    MSR: 8000000000009032    OR3: 0000000000000000
      CTR: 0000000000000000    LR:  c000000000057350    XER: 0000000000000000
      CCR: 0000000024000048    MQ:  0000000000000000    DAR: 000001000ad763b0
      DSISR: 0000000000000000     Syscall Result: 0000000000000000
      NIP [c000000000055730] .plpar_hcall_norets
      LR  [c000000000057350] .pseries_shared_idle_sleep
      #0 [c000000000f2bd50] (null) at 0  (unreliable)
      #1 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
      #2 [c000000000f2be70] .rest_init at c000000000009de4
      #3 [c000000000f2bef0] .start_kernel at c000000000850eb4
      #4 [c000000000f2bf90] .start_here_common at c0000000000083d8

Again, I don't understand how the non-panicking active tasks are stopped
by the fadump facility, but is it because you cannot differentiate kdumps
from fadumps that you don't show the exception frame with the v2 patch?
Hi Dave,

The crashing cpu makes the rtas call ibm,os-term to the firmware, which
saves the regs info of all online cpus. AFAIK, there is no exception frame
marker (which we are using to detect one) set for these stack frames
by the kernel. With v1, I was printing the registers without looking for
the exception frame marker, if the registers are saved..
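
For illustration, the difference between marker-based detection and the v1
behaviour described above could be sketched roughly as below. This is a
standalone sketch, not code from the crash utility. The marker value is the
kernel's "regshere" constant (STACK_FRAME_REGS_MARKER); the function names,
the frame-as-array representation, and the marker_idx parameter are
assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASCII "regshere": the kernel writes this into stack frames that hold
 * saved pt_regs, so that unwinders can recognise an exception frame. */
#define REGS_MARKER 0x7265677368657265ULL

/*
 * Return 1 if the frame (viewed as an array of 64-bit words starting at the
 * frame's stack pointer) holds the marker at word index marker_idx, else 0.
 * marker_idx is a hypothetical parameter: the real slot depends on the
 * kernel's frame layout.
 */
int frame_has_regs_marker(const uint64_t *frame, size_t nwords,
                          size_t marker_idx)
{
    assert(frame != NULL);
    if (marker_idx >= nwords)
        return 0;
    return frame[marker_idx] == REGS_MARKER;
}

/*
 * Decide whether to print a register dump.
 *   ignore_marker != 0: v1-style, print whenever firmware saved the regs.
 *   ignore_marker == 0: print only when the marker was found.
 */
int should_print_regs(int regs_saved_by_firmware, int marker_found,
                      int ignore_marker)
{
    if (ignore_marker)
        return regs_saved_by_firmware;
    return marker_found;
}
```

With firmware-saved registers and no marker on the stack, the marker-based
path prints nothing, which matches the v2 output quoted earlier.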

Would it be possible to also show the exception frame type in brackets and
the register dump for those fadump non-panicking active tasks?

Hmmm.. Let me have a hard look at this.
Will try and improve this..
Hari,

I was tinkering around with ppc64_get_dumpfile_stack_frame() from your v2 patch,
and this seems to work:

          else {
                  *ksp = pt_regs->gpr[1];
                  if (IS_KVADDR(*ksp)) {
                          readmem(*ksp+16, KVADDR, nip, sizeof(ulong),
                                  "Regs NIP value", FAULT_ON_ERROR);
+                       ppc64_print_regs(pt_regs);
                          return TRUE;
                  } else {
                          if (IN_TASK_VMA(bt_in->task, *ksp))
                                  fprintf(fp, "%0lx: Task is running in user space\n",
                                          bt_in->task);
                          else
                                  fprintf(fp, "%0lx: Invalid Stack Pointer %0lx\n",
                                          bt_in->task, *ksp);
                          *nip = pt_regs->nip;
                          ppc64_print_regs(pt_regs);
                          return FALSE;
                  }
          }

And if the task were to have been running in userspace, it already dumps the
registers in the "else" section above.
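
The branch structure of the snippet above can be restated as a small
standalone sketch: the saved R1 decides whether the NIP is read from the
stack frame (the snippet reads it at ksp+16, which is the LR save slot in the
64-bit PowerPC ELF stack frame layout) or taken straight from the saved
pt_regs. The KERNEL_BASE constant and the helper names here are assumptions
for illustration only. The comparison is a simplified stand-in for crash's
IS_KVADDR(), not its actual definition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: kernel addresses are those at or above
 * 0xc000000000000000, as in the traces quoted in this thread. */
#define KERNEL_BASE 0xc000000000000000ULL

enum nip_source {
    NIP_FROM_STACK_FRAME,   /* ksp is a kernel address: read slot at ksp+16 */
    NIP_FROM_PT_REGS        /* otherwise: fall back to pt_regs->nip */
};

/* Simplified stand-in for IS_KVADDR(). */
int is_kernel_addr(uint64_t addr)
{
    return addr >= KERNEL_BASE;
}

/* Mirror the if/else in the snippet: pick where the NIP comes from. */
enum nip_source pick_nip_source(uint64_t ksp)
{
    return is_kernel_addr(ksp) ? NIP_FROM_STACK_FRAME : NIP_FROM_PT_REGS;
}

/* Address of the slot the snippet reads when ksp is a kernel address. */
uint64_t nip_slot_addr(uint64_t ksp)
{
    assert(is_kernel_addr(ksp));
    return ksp + 16;
}
```

For the "swapper" sample, R1 is c000000000f2bd50, so the first branch is
taken and the registers would be printed by the added ppc64_print_regs() call.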

I see that the regs->trap is 0, so I understand now that there's nothing to
translate w/respect to the exception frame type, but a follow-up translation
of the NIP and LR would at least show that there was some kind of hypercall
involved.  (Whether it can be firmly determined whether FADUMP was responsible
is another question)
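
As a rough sketch of that idea: when regs->trap maps to a known vector, a
bracketed header can be produced; when it is 0, no header is possible, but the
NIP and LR can still be shown with their symbols. The helper names below are
hypothetical, the vector table is deliberately partial, and the symbol names
are passed in by the caller rather than looked up. The output strings follow
the formats quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Partial table of PowerPC exception vectors; returns NULL for trap values
 * with no known type (including 0, as seen for firmware-saved regs). */
const char *trap_name(unsigned long long trap)
{
    switch (trap) {
    case 0x300: return "Data Access";
    case 0x400: return "Instruction Access";
    case 0x500: return "Hardware Interrupt";
    case 0x900: return "Decrementer";
    case 0xc00: return "System Call";
    default:    return NULL;
    }
}

/* Write e.g. "System Call [c00] exception frame:" into buf.
 * Returns 0 and leaves buf empty when the trap has no known type. */
int format_eframe_header(char *buf, size_t len, unsigned long long trap)
{
    const char *name = trap_name(trap);

    assert(buf != NULL && len > 0);
    if (name == NULL) {
        buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, len, "%s [%llx] exception frame:", name, trap);
}

/* Write e.g. "NIP [c000000000055730] .plpar_hcall_norets" into buf. */
int format_reg_symbol(char *buf, size_t len, const char *reg,
                      unsigned long long addr, const char *sym)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%-3s [%016llx] %s", reg, addr, sym);
}
```

With trap == 0 the header is simply omitted and only the NIP/LR lines remain,
which is what the v1 "swapper" output above shows.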

Hi Dave,

I did think of it but I was wary considering two register prints like below,
if there is an exception frame..

      PID: 2121   TASK: c0000001af90c600  CPU: 2   COMMAND: "sshd"
       R0:  c0000000003e5280    R1:  c0000001ae047a30    R2:  c000000000fd5a00
       R3:  0000000000000001    R4:  000000000000019e    R5:  000000000000000f
       R6:  0000000000000004    R7:  c0000001ae047bb8    R8:  00000000000b3d9f
       R9:  00000000000000f0    R10: 0000000000000678    R11: c0000000008e0f38
       R12: c0000000003e6310    R13: c00000000b781200    R14: 0000000000000000
       R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
       R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
       R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
       R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: c0000001ae047bb8
       R27: c0000001b17f4d80    R28: c000000000c60580    R29: 000000000000019e
       R30: 000000000000000f    R31: 000000000000090b
       NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000000
       CTR: c0000000003e6310    LR:  c0000000003e493c    XER: 0000000020000000
       CCR: 0000000024004824    MQ:  0000000000000000    DAR: 000001000b7e1640
       DSISR: 0000000002000000     Syscall Result: 0000000000000000
       #0 [c0000001ae047a30] (null) at c0000000fd783c00  (unreliable)
       #1 [c0000001ae047a70] avc_has_perm at c0000000003e5280
       #2 [c0000001ae047b60] sock_has_perm at c0000000003e6238
       #3 [c0000001ae047be0] security_socket_sendmsg at c0000000003e28fc
       #4 [c0000001ae047c30] sock_sendmsg at c00000000072d53c
       #5 [c0000001ae047c60] sock_write_iter at c00000000072d644
       #6 [c0000001ae047d00] __vfs_write at c0000000002ed97c
       #7 [c0000001ae047d90] vfs_write at c0000000002ef328
       #8 [c0000001ae047de0] sys_write at c0000000002f0f00
       #9 [c0000001ae047e30] system_call at c00000000000b184
       System Call [c00] exception frame:
       R0:  0000000000000004    R1:  00003ffff81eb220    R2:  00003fffb6b99800
       R3:  0000000000000003    R4:  000001000b80e3c0    R5:  0000000000000034
       R6:  00003ffff81eb2e4    R7:  000000000000021e    R8:  0000000000000000
       R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
       R12: 0000000000000000    R13: 00003fffb6497730    R14: 0000000000000000
       R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
       R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
       R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
       R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: 00003ffff81eb430
       R27: 00003ffff81eb420    R28: 00003ffff81eb424    R29: 00003ffff81eb2e4
       R30: 000001000b80e3c0    R31: 0000000000000034
       NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000003
       CTR: 0000000000000000    LR:  000000005df1c3e4    XER: 0000000000000000
       CCR: 0000000044004824    MQ:  0000000000000001    DAR: 00003fffb729c590
       DSISR: 000000000a000000     Syscall Result: 0000000000000000

instead of this..

      PID: 2121   TASK: c0000001af90c600  CPU: 2   COMMAND: "sshd"
       #0 [c0000001ae047a30] (null) at c0000000fd783c00  (unreliable)
       #1 [c0000001ae047a70] avc_has_perm at c0000000003e5280
       #2 [c0000001ae047b60] sock_has_perm at c0000000003e6238
       #3 [c0000001ae047be0] security_socket_sendmsg at c0000000003e28fc
       #4 [c0000001ae047c30] sock_sendmsg at c00000000072d53c
       #5 [c0000001ae047c60] sock_write_iter at c00000000072d644
       #6 [c0000001ae047d00] __vfs_write at c0000000002ed97c
       #7 [c0000001ae047d90] vfs_write at c0000000002ef328
       #8 [c0000001ae047de0] sys_write at c0000000002f0f00
       #9 [c0000001ae047e30] system_call at c00000000000b184
       System Call [c00] exception frame:
       R0:  0000000000000004    R1:  00003ffff81eb220    R2:  00003fffb6b99800
       R3:  0000000000000003    R4:  000001000b80e3c0    R5:  0000000000000034
       R6:  00003ffff81eb2e4    R7:  000000000000021e    R8:  0000000000000000
       R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
       R12: 0000000000000000    R13: 00003fffb6497730    R14: 0000000000000000
       R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
       R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
       R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
       R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: 00003ffff81eb430
       R27: 00003ffff81eb420    R28: 00003ffff81eb424    R29: 00003ffff81eb2e4
       R30: 000001000b80e3c0    R31: 0000000000000034
       NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000003
       CTR: 0000000000000000    LR:  000000005df1c3e4    XER: 0000000000000000
       CCR: 0000000044004824    MQ:  0000000000000001    DAR: 00003fffb729c590
       DSISR: 000000000a000000     Syscall Result: 0000000000000000

On second thought, that may not be bad after all??
So, I am ok with the change you propose.
Hmmm, except that in the "sshd" sample showing the firmware-generated eframe,
in which the task was presumably running in kernel space when firmware took
over (?), it has a userspace NIP of 00003fffb6ac8400.  What's happening there?

IIUC, NIP 00003fffb6ac8400 must have caused the exception (system call in
this case), and the backtrace shows the kernel call stack following the
system call?

Thanks
Hari
