----- Original Message -----
>
> ----- Original Message -----
> > Hello,
> >
> > I just noticed that on ppc64le, sometimes "bt" cannot find the stack
> > info of the current process. For example, there is a vmcore captured by
> > kdump on a ppc64le system, which is running kernel version 3.10. The
> > vmcore was captured when the kernel oopsed. There is no stack info found by
> > bt:
>
> Hello Han,
>
> I've never worked on the backtrace code for ppc64, as it was written
> by (and is maintained by) IBM. From the debug messages, what happened is
> that the starting IP/SP hooks are not being found. The crash command
> sequence presumably looks like this:
>
>   cmd_bt
>   back_trace
>   get_kdump_regs
>   get_netdump_regs
>   get_netdump_regs_ppc64 (should set up bt->machdep to point to NT_PRSTATUS note)
>   ppc64_get_stack_frame
>   ppc64_get_dumpfile_stack_frame
>   ppc64_kdump_stack_frame (should get IP/SP pair based upon NT_PRSTATUS note contents)
>   ppc64_back_trace_cmd
>   ppc64_back_trace
>
> ppc64_kdump_stack_frame() should pull the starting NIP/KSP values from the
> pt_regs structure in the per-cpu NT_PRSTATUS note, but it appears that it does not,
> leaving the registers at their initialized values of NULL.
>
> This causes the failure later on when ppc64_back_trace_cmd() is called, which
> prints the "=> PC: 0 () FP: 0" debug message shown below, and later on ppc64_back_trace()
> prints the "cannot find the stack info." debug message.
>
> Without the dumpfile, I can't offer much else. Can you verify the crash utility
> stack trail above, and if it is as I suspect, figure out why ppc64_kdump_stack_frame()
> is failing? Or what other path it is taking?

Actually, if this is a compressed kdump, ppc64_kdump_stack_frame() will not be
called, and the register access is done inside ppc64_get_dumpfile_stack_frame().
The ppc64_get_dumpfile_stack_frame() function first grabs the registers from
the pt_regs structure in the per-cpu NT_PRSTATUS note, but then also checks
the hard and soft IRQ stacks, and the hardware interrupt stack, for known
instances of kernel dump functions, which would override the pt_regs contents.
If nothing is found on those stacks, the registers from the NT_PRSTATUS note
are used.

Dave

>
> >
> > crash 7.0.9-2.ael7b
> > Copyright (C) 2002-2014 Red Hat, Inc.
> > Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> > Copyright (C) 1999-2006 Hewlett-Packard Co
> > Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> > Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> > Copyright (C) 2005, 2011 NEC Corporation
> > Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> > Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> > This program is free software, covered by the GNU General Public License,
> > and you are welcome to change it and/or distribute copies of it under
> > certain conditions. Enter "help copying" to see the conditions.
> > This program has absolutely no warranty. Enter "help warranty" for
> > details.
> >
> > GNU gdb (GDB) 7.6
> > Copyright (C) 2013 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "powerpc64le-unknown-linux-gnu"...
> >
> >       KERNEL: /usr/lib/debug/lib/modules/3.10.0-221.ael7b.ppc64le/vmlinux
> >     DUMPFILE: /var/crash/127.0.0.1-2015.01.15-22:19:14/vmcore [PARTIAL DUMP]
> >         CPUS: 16
> >         DATE: Thu Jan 15 21:18:16 2015
> >       UPTIME: 17:53:43
> > LOAD AVERAGE: 213.58, 213.23, 212.70
> >        TASKS: 1383
> >     NODENAME: thymelp2.isst.aus.stglabs.ibm.com
> >      RELEASE: 3.10.0-221.ael7b.ppc64le
> >      VERSION: #1 SMP Wed Jan 7 09:27:09 EST 2015
> >      MACHINE: ppc64le (3425 Mhz)
> >       MEMORY: 15 GB
> >        PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
> >          PID: 1970
> >      COMMAND: "cat"
> >         TASK: c0000003130874a0 [THREAD_INFO: c00000005069c000]
> >          CPU: 5
> >        STATE: TASK_RUNNING (PANIC)
> >
> > crash> set debug 99
> > debug: 99
> > crash> bt
> > PID: 1970 TASK: c0000003130874a0 CPU: 5 COMMAND: "cat"
> > GETBUF(16384 -> 0)
> > <readmem: c00000005069c000, KVADDR, "stack contents", 16384, (ROE), 10a81570>
> > <read_diskdump: addr: c00000005069c000 paddr: 5069c000 cnt: 16384>
> > read_diskdump: paddr/pfn: 5069c000/5069 -> cache physical page: 50690000
> > c00000005069c018: do_no_restart_syscall
> > c00000005069e870: blk_throtl_bio+240
> > c00000005069e990: clone_endio
> > c00000005069ea00: generic_make_request_checks+836
> > c00000005069eab8: hardware_interrupt_common+128
> > c00000005069eac0: generic_make_request+36
> > c00000005069eb10: mempool_alloc_slab+36
> > c00000005069eb30: mempool_alloc+256
> > c00000005069eb50: mempool_alloc_slab+36
> > c00000005069ebc0: get_request+948
> > c00000005069ec00: __split_and_process_bio+1408
> > c00000005069ec20: autoremove_wake_function
> > c00000005069ec80: find_busiest_group+544
> > c00000005069edf0: load_balance+684
> > c00000005069ee10: blk_throtl_bio+240
> > c00000005069ee70: find_busiest_group+544
> > c00000005069eee0: dequeue_task_fair+968
> > c00000005069ef30: clone_endio
> > c00000005069ef50: get_page_from_freelist+1436
> > c00000005069f0a0: pSeries_cause_ipi_mux+112
> > c00000005069f0c0: smp_send_reschedule+164
> > c00000005069f0e0: default_wake_function+708
> > c00000005069f160: __wake_up_locked+116
> > c00000005069f1b0: ep_poll_callback+444
> > c00000005069f250: run_posix_cpu_timers+104
> > c00000005069f2c0: hvterm_raw_put_chars+64
> > c00000005069f2e0: hvc_console_print+336
> > c00000005069f3a8: initial_stab+2048
> > c00000005069f3b0: crash_save_cpu+252
> > c00000005069f488: cik_cp_resume+13476
> > c00000005069f490: dev_get_drvdata
> > c00000005069f580: default_machine_kexec+332
> > c00000005069f610: pSeries_machine_kexec+60
> > c00000005069f680: machine_kexec+56
> > c00000005069f6a0: crash_kexec+312
> > c00000005069f6f0: dev_attr_show+64
> > c00000005069f748: cik_cp_resume+13476
> > c00000005069f750: dev_get_drvdata
> > c00000005069f7f0: radeon_hwmon_show_temp+72
> > c00000005069f800: slb_miss_realmode+80
> > c00000005069f808: dev_get_drvdata
> > c00000005069f810: radeon_hwmon_show_temp+32
> > c00000005069f890: die+840
> > c00000005069f930: bad_page_fault+224
> > c00000005069f948: radeon_hwmon_show_temp+72
> > c00000005069f9a0: handle_page_fault+44
> > c00000005069fa00: dev_attr_show+64
> > c00000005069fa58: cik_cp_resume+13476
> > c00000005069fa60: dev_get_drvdata
> > c00000005069fb00: radeon_hwmon_show_temp+72
> > c00000005069fb10: slb_miss_realmode+80
> > c00000005069fb18: dev_get_drvdata
> > c00000005069fb20: radeon_hwmon_show_temp+32
> > c00000005069fb60: handle_mm_fault+1724
> > c00000005069fb80: sysfs_open_file
> > c00000005069fbd0: handle_page_fault+16
> > c00000005069fc90: alloc_pages_current+416
> > c00000005069fd00: dev_attr_show+64
> > c00000005069fd30: sysfs_read_file+220
> > c00000005069fde0: sys_read+304
> > c00000005069fe40: syscall_exit
> > [3fffd0d6fe88] back_trace:
> > task: c0000003130874a0
> > flags: 0
> > instptr: 0
> > stkptr: 0
> > bptr: 0
> > stackbase: c00000005069c000
> > stacktop: c0000000506a0000
> > tc: 1003c7b9fa8 (1970, c0000003130874a0)
> > hp: 0
> > ref: 0
> > stackbuf: 10a81570
> > textlist: 0
> > frameptr: 0
> > call_target: none
> > eframe_ip: 0
> > debug: 0
> > radix: 0
> > cpumask: 0
> > => PC: 0 () FP: 0
> > GETBUF(248 -> 1)
> > GETBUF(1500 -> 2)
> > cannot find the stack info.
> > FREEBUF(2)
> > FREEBUF(1)
> > crash>
> >
> >
> > Is this a problem?
> >
> > Thanks in advance!
> >
> > --
> > Crash-utility mailing list
> > Crash-utility@xxxxxxxxxx
> > https://www.redhat.com/mailman/listinfo/crash-utility
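
For reference, the register selection described above for ppc64_get_dumpfile_stack_frame() can be sketched roughly as follows. This is only an illustration of the override-then-fallback order, not the crash utility's actual code: struct start_regs, struct stack_hit, dump_funcs[], is_dump_func() and pick_start_regs() are all hypothetical names, and the list of dump functions is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not crash's real structures. */
struct start_regs {
    uint64_t nip;   /* starting instruction pointer */
    uint64_t ksp;   /* starting kernel stack pointer */
};

struct stack_hit {
    uint64_t ip;          /* text address found on an IRQ stack */
    uint64_t sp;          /* stack address where it was found */
    const char *symbol;   /* symbol the text address resolves to */
};

/* Assumed list of dump-path functions; the real list may differ. */
static const char *const dump_funcs[] = { "crash_kexec", "machine_kexec" };

static int is_dump_func(const char *symbol)
{
    size_t i;

    for (i = 0; i < sizeof(dump_funcs) / sizeof(dump_funcs[0]); i++)
        if (strcmp(symbol, dump_funcs[i]) == 0)
            return 1;
    return 0;
}

/*
 * Choose the starting IP/SP pair. A dump function seen on an IRQ stack
 * overrides the NT_PRSTATUS registers; otherwise the note registers are
 * used. Returns 0 with *out zeroed when neither source yields a pair.
 */
static int pick_start_regs(const struct start_regs *note,
                           const struct stack_hit *hits, size_t nhits,
                           struct start_regs *out)
{
    size_t i;

    assert(out != NULL);
    out->nip = 0;
    out->ksp = 0;

    for (i = 0; i < nhits; i++) {
        if (is_dump_func(hits[i].symbol)) {
            out->nip = hits[i].ip;
            out->ksp = hits[i].sp;
            return 1;
        }
    }

    if (note != NULL && note->nip != 0 && note->ksp != 0) {
        *out = *note;
        return 1;
    }
    return 0;
}
```

In this sketch, a return value of 0 with both registers left at zero corresponds to the situation in the debug output, where the backtrace starts from "PC: 0 () FP: 0" and then reports "cannot find the stack info."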