Re: [PATCH] ppc64: fix 'bt' command for vmcore captured with fadump.

Dave Anderson <anderson@xxxxxxxxxx> · Tue, 24 Jan 2017 14:17:03 -0500 (EST)

----- Original Message -----
> 
> 
> On Tuesday 24 January 2017 11:53 PM, Dave Anderson wrote:
> >
> > ----- Original Message -----
> >>
> >> On Monday 23 January 2017 11:43 PM, Dave Anderson wrote:
> >>> ----- Original Message -----
> >>>> On Saturday 21 January 2017 02:00 AM, Dave Anderson wrote:
> >>>>> ----- Original Message -----
> >>>>>
> >>>>> ... [cut] ...
> >>>>>
> >>>>>>> Also, the exception frame doesn't even show the [bracketed] type of
> >>>>>>> exception that occurred -- it's just a register dump followed by the
> >>>>>>> remainder of the backtrace.  Upon a quick glance, it's not obvious that
> >>>>>>> they are even active tasks.  And traditionally, all of the other
> >>>>>>> architectures have always dumped a full trace.
> >>>>>>>
> >>>>>>> I'm not sure what the mechanism is for shutting down the non-active
> >>>>>>> FADUMP tasks, so that's why I asked if you could restrict this change
> >>>>>>> to just those types of dumps.  (For that matter, is it even possible to
> >>>>>>> differentiate a real kdump from an FADUMP dumpfile --  aside from a
> >>>>>> Hi Dave,
> >>>>>>
> >>>>>> Differentiating a kdump and fadump dumpfile is not possible, except that
> >>>>>> the stack search would invariably fail and ptregs are guaranteed to be
> >>>>>> saved by firmware in case of fadump. Posted v2, which doesn't change bt
> >>>>>> output for anything but active tasks in case of fadump.
> >>>>> Ok, so let me get this straight.  The only difference I see with the v2
> >>>>> patch is that fadump non-panicking active tasks change from something
> >>>>> like this:
> >>>>>      
> >>>>>      PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
> >>>>>       #0 [c000000000f2ba30] (null) at 3aae291c67  (unreliable)
> >>>>>       #1 [c000000000f2bae0] .tick_dev_program_event at c0000000000d16fc
> >>>>>       #2 [c000000000f2bb90] .__hrtimer_start_range_ns at c0000000000c4bcc
> >>>>>       #3 [c000000000f2bcb0] .tick_nohz_stop_sched_tick at c0000000000d2d30
> >>>>>       #4 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
> >>>>>       #5 [c000000000f2be70] .rest_init at c000000000009de4
> >>>>>       #6 [c000000000f2bef0] .start_kernel at c000000000850eb4
> >>>>>       #7 [c000000000f2bf90] .start_here_common at c0000000000083d8
> >>>>>      
> >>>>> to this:
> >>>>>      
> >>>>>      PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
> >>>>>       #0 [c000000000f2bd50] (null) at 0  (unreliable)
> >>>>>       #1 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
> >>>>>       #2 [c000000000f2be70] .rest_init at c000000000009de4
> >>>>>       #3 [c000000000f2bef0] .start_kernel at c000000000850eb4
> >>>>>       #4 [c000000000f2bf90] .start_here_common at c0000000000083d8
> >>>>>      
> >>>>> But with your v1 patch, you also dumped the exception frame:
> >>>>>      
> >>>>>      PID: 0      TASK: c000000000e7f6d0  CPU: 0   COMMAND: "swapper"
> >>>>>       R0:  0000000000000000    R1:  c000000000f2bd50    R2:  c000000000f27628
> >>>>>       R3:  0000000000000000    R4:  0000000000000000    R5:  8000000002144400
> >>>>>       R6:  800000001314c4f8    R7:  0000000000000000    R8:  0000000000000000
> >>>>>       R9:  ffffffffffffffff    R10: 0000000000000000    R11: 80003fbff901700c
> >>>>>       R12: 0000000000000000    R13: c000000000ff2500    R14: 0000000001a3fa58
> >>>>>       R15: 00000000002230a8    R16: 0000000000223150    R17: 0000000000223144
> >>>>>       R18: 0000000000c8a098    R19: 0000000002b13a58    R20: 0000000000000000
> >>>>>       R21: 0000000002b135d8    R22: 0000000002b13530    R23: 0000000002280000
> >>>>>       R24: 0000000002b135f0    R25: c000000000fd5c48    R26: c0000000010942f0
> >>>>>       R27: c0000000010942f0    R28: c0000000005fd168    R29: 0000000000000008
> >>>>>       R30: c000000000eb1d68    R31: c000000000f28080
> >>>>>       NIP: c000000000055730    MSR: 8000000000009032    OR3: 0000000000000000
> >>>>>       CTR: 0000000000000000    LR:  c000000000057350    XER: 0000000000000000
> >>>>>       CCR: 0000000024000048    MQ:  0000000000000000    DAR: 000001000ad763b0
> >>>>>       DSISR: 0000000000000000     Syscall Result: 0000000000000000
> >>>>>       NIP [c000000000055730] .plpar_hcall_norets
> >>>>>       LR  [c000000000057350] .pseries_shared_idle_sleep
> >>>>>       #0 [c000000000f2bd50] (null) at 0  (unreliable)
> >>>>>       #1 [c000000000f2bdc0] .cpu_idle at c000000000015bf0
> >>>>>       #2 [c000000000f2be70] .rest_init at c000000000009de4
> >>>>>       #3 [c000000000f2bef0] .start_kernel at c000000000850eb4
> >>>>>       #4 [c000000000f2bf90] .start_here_common at c0000000000083d8
> >>>>>      
> >>>>> Again, I don't understand how the non-panicking active tasks are stopped
> >>>>> by the fadump facility, but is it because you cannot differentiate kdumps
> >>>>> from fadumps that you don't show the exception frame with the v2 patch?
> >>>> Hi Dave,
> >>>>
> >>>> The crashing CPU makes the RTAS call ibm,os-term to the firmware, which
> >>>> saves the register info of all online CPUs. AFAIK, the kernel does not
> >>>> set an exception frame marker (which we use to detect one) for these
> >>>> stack frames. With v1, I was printing the registers, if they were saved,
> >>>> without looking for the exception frame marker.
> >>>>
> >>>>> Would it be possible to also show the exception frame type in brackets
> >>>>> and the register dump for those fadump non-panicking active tasks?
> >>>>>
> >>>> Hmmm.. Let me have a hard look at this.
> >>>> Will try and improve this..
> >>> Hari,
> >>>
> >>> I was tinkering around with ppc64_get_dumpfile_stack_frame() from your v2
> >>> patch, and this seems to work:
> >>>
> >>>           else {
> >>>                   *ksp = pt_regs->gpr[1];
> >>>                   if (IS_KVADDR(*ksp)) {
> >>>                           readmem(*ksp+16, KVADDR, nip, sizeof(ulong),
> >>>                                   "Regs NIP value", FAULT_ON_ERROR);
> >>> +                       ppc64_print_regs(pt_regs);
> >>>                           return TRUE;
> >>>                   } else {
> >>>                           if (IN_TASK_VMA(bt_in->task, *ksp))
> >>>                                   fprintf(fp, "%0lx: Task is running in user space\n",
> >>>                                           bt_in->task);
> >>>                           else
> >>>                                   fprintf(fp, "%0lx: Invalid Stack Pointer %0lx\n",
> >>>                                           bt_in->task, *ksp);
> >>>                           *nip = pt_regs->nip;
> >>>                           ppc64_print_regs(pt_regs);
> >>>                           return FALSE;
> >>>                   }
> >>>           }
> >>>
> >>> And if the task were to have been running in userspace, it already dumps
> >>> the registers in the "else" section above.
> >>>
> >>> I see that the regs->trap is 0, so I understand now that there's nothing
> >>> to translate w/respect to the exception frame type, but a follow-up
> >>> translation of the NIP and LR would at least show that there was some kind
> >>> of hypercall involved.  (Whether it can be firmly determined whether FADUMP
> >>> was responsible is another question.)
> >>>
> >>>
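
As a rough illustration of why a regs->trap of 0 leaves nothing to print in
brackets, here is a minimal standalone sketch of a trap-to-label lookup.  It
is not the crash source: the struct, the table and format_eframe_label() are
hypothetical names, and the table is a small illustrative subset of PowerPC
vectors.  Only the "System Call [c00] exception frame:" wording is taken from
the bt output quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lookup entry: exception vector -> label shown in brackets. */
struct trap_label {
	unsigned long vector;
	const char *label;
};

/* Illustrative subset only; a real table would carry more vectors. */
static const struct trap_label trap_labels[] = {
	{ 0x300, "Data Access" },
	{ 0x400, "Instruction Access" },
	{ 0x500, "Hardware Interrupt" },
	{ 0x900, "Decrementer" },
	{ 0xc00, "System Call" },
};

/*
 * Write an exception frame header such as
 * "System Call [c00] exception frame:" into buf.
 * Returns 1 if the trap value was recognized, 0 otherwise.  A trap of 0,
 * as seen in the firmware-saved registers, matches nothing, so buf is
 * left as an empty string.
 */
static int format_eframe_label(unsigned long trap, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(trap_labels) / sizeof(trap_labels[0]); i++) {
		if (trap_labels[i].vector == trap) {
			snprintf(buf, len, "%s [%lx] exception frame:",
				 trap_labels[i].label, trap);
			return 1;
		}
	}
	return 0;
}
```

With a lookup of this shape, the firmware-saved registers (trap 0) produce no
header at all, which is consistent with printing only the raw register dump
for those tasks.
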
> >> Hi Dave,
> >>
> >> I did think of it, but I was wary of ending up with two register prints
> >> like below if there is an exception frame:
> >>
> >>       PID: 2121   TASK: c0000001af90c600  CPU: 2   COMMAND: "sshd"
> >>        R0:  c0000000003e5280    R1:  c0000001ae047a30    R2:  c000000000fd5a00
> >>        R3:  0000000000000001    R4:  000000000000019e    R5:  000000000000000f
> >>        R6:  0000000000000004    R7:  c0000001ae047bb8    R8:  00000000000b3d9f
> >>        R9:  00000000000000f0    R10: 0000000000000678    R11: c0000000008e0f38
> >>        R12: c0000000003e6310    R13: c00000000b781200    R14: 0000000000000000
> >>        R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
> >>        R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
> >>        R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
> >>        R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: c0000001ae047bb8
> >>        R27: c0000001b17f4d80    R28: c000000000c60580    R29: 000000000000019e
> >>        R30: 000000000000000f    R31: 000000000000090b
> >>        NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000000
> >>        CTR: c0000000003e6310    LR:  c0000000003e493c    XER: 0000000020000000
> >>        CCR: 0000000024004824    MQ:  0000000000000000    DAR: 000001000b7e1640
> >>        DSISR: 0000000002000000     Syscall Result: 0000000000000000
> >>        #0 [c0000001ae047a30] (null) at c0000000fd783c00  (unreliable)
> >>        #1 [c0000001ae047a70] avc_has_perm at c0000000003e5280
> >>        #2 [c0000001ae047b60] sock_has_perm at c0000000003e6238
> >>        #3 [c0000001ae047be0] security_socket_sendmsg at c0000000003e28fc
> >>        #4 [c0000001ae047c30] sock_sendmsg at c00000000072d53c
> >>        #5 [c0000001ae047c60] sock_write_iter at c00000000072d644
> >>        #6 [c0000001ae047d00] __vfs_write at c0000000002ed97c
> >>        #7 [c0000001ae047d90] vfs_write at c0000000002ef328
> >>        #8 [c0000001ae047de0] sys_write at c0000000002f0f00
> >>        #9 [c0000001ae047e30] system_call at c00000000000b184
> >>        System Call [c00] exception frame:
> >>        R0:  0000000000000004    R1:  00003ffff81eb220    R2:  00003fffb6b99800
> >>        R3:  0000000000000003    R4:  000001000b80e3c0    R5:  0000000000000034
> >>        R6:  00003ffff81eb2e4    R7:  000000000000021e    R8:  0000000000000000
> >>        R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
> >>        R12: 0000000000000000    R13: 00003fffb6497730    R14: 0000000000000000
> >>        R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
> >>        R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
> >>        R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
> >>        R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: 00003ffff81eb430
> >>        R27: 00003ffff81eb420    R28: 00003ffff81eb424    R29: 00003ffff81eb2e4
> >>        R30: 000001000b80e3c0    R31: 0000000000000034
> >>        NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000003
> >>        CTR: 0000000000000000    LR:  000000005df1c3e4    XER: 0000000000000000
> >>        CCR: 0000000044004824    MQ:  0000000000000001    DAR: 00003fffb729c590
> >>        DSISR: 000000000a000000     Syscall Result: 0000000000000000
> >>
> >>
> >> instead of this..
> >>
> >>       PID: 2121   TASK: c0000001af90c600  CPU: 2   COMMAND: "sshd"
> >>        #0 [c0000001ae047a30] (null) at c0000000fd783c00  (unreliable)
> >>        #1 [c0000001ae047a70] avc_has_perm at c0000000003e5280
> >>        #2 [c0000001ae047b60] sock_has_perm at c0000000003e6238
> >>        #3 [c0000001ae047be0] security_socket_sendmsg at c0000000003e28fc
> >>        #4 [c0000001ae047c30] sock_sendmsg at c00000000072d53c
> >>        #5 [c0000001ae047c60] sock_write_iter at c00000000072d644
> >>        #6 [c0000001ae047d00] __vfs_write at c0000000002ed97c
> >>        #7 [c0000001ae047d90] vfs_write at c0000000002ef328
> >>        #8 [c0000001ae047de0] sys_write at c0000000002f0f00
> >>        #9 [c0000001ae047e30] system_call at c00000000000b184
> >>        System Call [c00] exception frame:
> >>        R0:  0000000000000004    R1:  00003ffff81eb220    R2:  00003fffb6b99800
> >>        R3:  0000000000000003    R4:  000001000b80e3c0    R5:  0000000000000034
> >>        R6:  00003ffff81eb2e4    R7:  000000000000021e    R8:  0000000000000000
> >>        R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000
> >>        R12: 0000000000000000    R13: 00003fffb6497730    R14: 0000000000000000
> >>        R15: 0000000000000000    R16: 000001000b7dad70    R17: 000000005dfd3c08
> >>        R18: 000000005dfd2838    R19: 00003ffff81eb620    R20: 000000005df74128
> >>        R21: 000001000b7d89a0    R22: 000000000000de4c    R23: 000000005df73b30
> >>        R24: 000000005dfd3c88    R25: 00003ffff81eb428    R26: 00003ffff81eb430
> >>        R27: 00003ffff81eb420    R28: 00003ffff81eb424    R29: 00003ffff81eb2e4
> >>        R30: 000001000b80e3c0    R31: 0000000000000034
> >>        NIP: 00003fffb6ac8400    MSR: 800000000000d033    OR3: 0000000000000003
> >>        CTR: 0000000000000000    LR:  000000005df1c3e4    XER: 0000000000000000
> >>        CCR: 0000000044004824    MQ:  0000000000000001    DAR: 00003fffb729c590
> >>        DSISR: 000000000a000000     Syscall Result: 0000000000000000
> >>
> >>
> >> On second thought, that may not be bad after all??
> >> So, I am ok with the change you propose.
> > Hmmm, except that in the "sshd" sample showing the firmware-generated
> > eframe, where the task was presumably running in kernel space when
> > firmware took over (?), it has a userspace NIP of 00003fffb6ac8400.
> > What's happening there?
> >
> 
> IIUC, NIP 00003fffb6ac8400 must have caused the exception (system call in
> this case), and the backtrace shows the kernel call stack following the
> system call?
> 
> Thanks
> Hari
> 

All right, I'll check in your v2 patch along with the one-line addition to
display the exception frame.  For now we'll skip the NIP and LR translation,
since in the case above, it's somewhat confusing.
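
For anyone following along, the resulting control flow can be sketched
standalone roughly as below.  This is only an approximation under stated
assumptions, not the checked-in code: struct regs_sketch, is_kvaddr_sketch()
and start_frame_sketch() are hypothetical stand-ins, and the kernel-address
test is a simplification of what IS_KVADDR() really does in crash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for the firmware-saved registers. */
struct regs_sketch {
	uint64_t gpr[32];
	uint64_t nip;
};

/*
 * Simplifying assumption: treat any address at or above
 * 0xc000000000000000 as a kernel virtual address.
 */
static int is_kvaddr_sketch(uint64_t addr)
{
	return addr >= 0xc000000000000000ULL;
}

/*
 * Returns 1 when a kernel backtrace can start from the saved stack
 * pointer (gpr[1]), 0 otherwise.  The registers are reported on both
 * paths, which is the effect of the one-line addition: previously only
 * the non-kernel path printed them.
 */
static int start_frame_sketch(const struct regs_sketch *regs, FILE *out)
{
	int kernel_sp;

	assert(regs != NULL && out != NULL);
	kernel_sp = is_kvaddr_sketch(regs->gpr[1]);

	if (!kernel_sp)
		fprintf(out, "stack pointer %016llx is not a kernel address\n",
			(unsigned long long)regs->gpr[1]);

	/* Raw values only; no NIP/LR symbol translation is attempted. */
	fprintf(out, "R1:  %016llx    NIP: %016llx\n",
		(unsigned long long)regs->gpr[1],
		(unsigned long long)regs->nip);

	return kernel_sp;
}
```

Fed the R1 values from the samples earlier in the thread, c000000000f2bd50
takes the kernel path and 00003ffff81eb220 does not.
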

Thanks,
  Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility