----- Original Message -----
> Hello,
>
> I am using crash version: 6.0.4-2.el6 on CentOS 6.3 (kernel
> 2.6.32-279.el6.x86_64). I apologize for my newbie questions, but
> googling did not help much.
>
> When analyzing a kernel dump, I am getting the following bt.
>
> crash> bt
> PID: 12663  TASK: ffff88036304f500  CPU: 0   COMMAND: "bash"
>  #0 [ffff88035b949570] machine_kexec at ffffffff8103281b
>  #1 [ffff88035b9495d0] crash_kexec at ffffffff810ba662
>  #2 [ffff88035b9496a0] oops_end at ffffffff81501290
>  #3 [ffff88035b9496d0] no_context at ffffffff81043bab
>  #4 [ffff88035b949720] __bad_area_nosemaphore at ffffffff81043e35
>  #5 [ffff88035b949770] bad_area at ffffffff81043f5e
>  #6 [ffff88035b9497a0] __do_page_fault at ffffffff81044710
>  #7 [ffff88035b9498c0] do_page_fault at ffffffff8150326e
>  #8 [ffff88035b9498f0] page_fault at ffffffff81500625
>     [exception RIP: ahaann+47]
>     RIP: ffffffffa06ce48f  RSP: ffff88035b9499a8  RFLAGS: 00010246
>     RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
>     RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff88035daef4e0
>     RBP: ffff88035b9499b8   R8: 0000000004a47daf   R9: ffffffffa06dae99
>     R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000007
>     R13: 00007fc82f4b8000  R14: 000000000000000a  R15: 0000000000000000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #9 [ffff88035b9499c0] ahaecho at ffffffffa06d2899 [ahadrv]
> #10 [ffff88035b949a00] writectl at ffffffffa06c366e [ahadrv]
> #11 [ffff88035b949e40] writeaha at ffffffffa06d3e7b [ahadrv]
> #12 [ffff88035b949e60] proc_file_write at ffffffff811e6e44
> #13 [ffff88035b949ea0] proc_reg_write at ffffffff811e0abe
> #14 [ffff88035b949ef0] vfs_write at ffffffff8117b068
> #15 [ffff88035b949f30] sys_write at ffffffff8117ba81
> #16 [ffff88035b949f80] system_call_fastpath at ffffffff8100b0f2
>     RIP: 0000003a29ada3c0  RSP: 00007ffffaec6830  RFLAGS: 00010202
>     RAX: 0000000000000001  RBX: ffffffff8100b0f2  RCX: 0000000000000065
>     RDX: 000000000000000a  RSI: 00007fc82f4b8000  RDI: 0000000000000001
>     RBP: 00007fc82f4b8000   R8: 000000000000000a   R9: 00007fc82f4aa700
>     R10: 00000000fffffff7  R11: 0000000000000246  R12: 000000000000000a
>     R13: 0000003a29d8c780  R14: 000000000000000a  R15: 0000000001e18460
>     ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
> crash>
>
> 1. Are the hex addr in [] right before the function name the stack
> frame ptr for that function?

On x86_64 machines, the "at <address>" shown is the address in that
frame's function to which the call instruction it has made will return.
So for example, taking frame #15, where "sys_write at ffffffff8117ba81"
has called vfs_write(), you can disassemble all instructions from the
beginning of sys_write() up to that address, as in this example (taken
from a different kernel, so the addresses differ):

crash> dis -r ffffffff80016e6b
0xffffffff80016e26 <sys_write>:         push   %r13
0xffffffff80016e28 <sys_write+2>:       mov    %rsi,%r13
0xffffffff80016e2b <sys_write+5>:       push   %r12
0xffffffff80016e2d <sys_write+7>:       mov    $0xfffffffffffffff7,%r12
0xffffffff80016e34 <sys_write+14>:      push   %rbp
0xffffffff80016e35 <sys_write+15>:      mov    %rdx,%rbp
0xffffffff80016e38 <sys_write+18>:      push   %rbx
0xffffffff80016e39 <sys_write+19>:      sub    $0x18,%rsp
0xffffffff80016e3d <sys_write+23>:      lea    0x14(%rsp),%rsi
0xffffffff80016e42 <sys_write+28>:      callq  0xffffffff8000b5b4 <fget_light>
0xffffffff80016e47 <sys_write+33>:      test   %rax,%rax
0xffffffff80016e4a <sys_write+36>:      mov    %rax,%rbx
0xffffffff80016e4d <sys_write+39>:      je     0xffffffff80016e86 <sys_write+96>
0xffffffff80016e4f <sys_write+41>:      mov    0x38(%rax),%rax
0xffffffff80016e53 <sys_write+45>:      lea    0x8(%rsp),%rcx
0xffffffff80016e58 <sys_write+50>:      mov    %rbp,%rdx
0xffffffff80016e5b <sys_write+53>:      mov    %r13,%rsi
0xffffffff80016e5e <sys_write+56>:      mov    %rbx,%rdi
0xffffffff80016e61 <sys_write+59>:      mov    %rax,0x8(%rsp)
0xffffffff80016e66 <sys_write+64>:      callq  0xffffffff800164d0 <vfs_write>
0xffffffff80016e6b <sys_write+69>:      mov    %rax,%r12
crash>

And the stack address shown in [] for the frame is the stack location
that contains that return address.

> 2. I am assuming the panic occurred in function ahaann() (and not in
> ahaecho() ). Is that right?

That's correct. The exception occurred precisely when executing the
instruction shown here: [exception RIP: ahaann+47], which is at RIP
ffffffffa06ce48f. You can do a "dis -r ahaann+47" to see the
instructions leading up to the fatal one. If you load the ahadrv
module's debugging data with "mod -s ahadrv", you can also get line
numbers interspersed with "dis -rl ahaann+47".

> 3. What is puzzling me is why there is no frame associated with call
> to ahaann(). Or is frame #8 associated to ahaann(). From the display
> it seems frame #8 is associated to page_fault() since 0xffffffff81500625
> is an address in page_fault(). Or am I totally misinterpreting the call
> stack.
>
> crash> dis ffffffff81500625
> 0xffffffff81500625 <page_fault+37>: jmpq 0xffffffff81500830

The ahaann() function didn't lay down a full frame because while it was
executing, it took a page fault exception. As soon as that occurred, an
exception frame was dumped onto the stack at that point (the register
dump). Control was then transferred to page_fault() to handle the
exception. Normally the exception handler would quietly resolve the page
fault and return to ahaann(), and the function would continue on. But
the address that caused the page fault was bogus/unresolvable, so it
never returned, but rather crashed the system.

So again, what you should do is:

crash> mod -s ahadrv  (presuming you've got the kernel-debuginfo package installed)
...
crash> dis -rl ahaann+47

And look at the last instruction shown. My guess is that it's
referencing a location through a NULL pointer (probably via one of the
NULL-filled RAX, RBX, RCX, RDX or RSI registers)?

> 4. I can understand the value of register dump for frame #8, due to
> the panic. What is the significance of the register dump for frame
> #16.

Whenever a program running in user-space enters the kernel, it does so
as the result of an exception, be it a system call, page fault,
interrupt, etc. And as with the in-kernel page fault exception, the
user's register set is laid down at the top of the kernel stack so that
it can be restored upon return to user-space.

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
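
The address arithmetic discussed in the answers to questions 1 and 2 can
be checked by hand. The following is only a minimal sketch in Python, not
part of crash itself: the helper names parse_frame() and symbol_base()
are hypothetical, and the regular expression assumes the "bt" frame-line
layout shown in this thread.

```python
import re

# Hypothetical pattern for one "bt" frame line, assuming the layout above:
#   #15 [ffff88035b949f30] sys_write at ffffffff8117ba81
FRAME_RE = re.compile(
    r"#(?P<num>\d+)\s+\[(?P<stack>[0-9a-f]+)\]\s+"
    r"(?P<func>\S+)\s+at\s+(?P<ret>[0-9a-f]+)"
)


def parse_frame(line):
    """Split a bt frame line into its number, stack address, function
    name and the text address shown after "at" (hypothetical helper)."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "num": int(m.group("num")),
        "stack": int(m.group("stack"), 16),
        "func": m.group("func"),
        "ret": int(m.group("ret"), 16),
    }


def symbol_base(addr, offset):
    """Start address of a function, given an address of the form
    symbol+offset (hypothetical helper)."""
    return addr - offset


frame = parse_frame("#15 [ffff88035b949f30] sys_write at ffffffff8117ba81")

# [exception RIP: ahaann+47] with RIP ffffffffa06ce48f
ahaann_base = symbol_base(0xFFFFFFFFA06CE48F, 47)

# In the example disassembly the callq at sys_write+64 occupies 5 bytes,
# so the return address is the next instruction, sys_write+69.
ret_addr = 0xFFFFFFFF80016E66 + 5

print(frame["func"], hex(frame["stack"]), hex(frame["ret"]))
print(hex(ahaann_base))
print(hex(ret_addr))
```

The last value printed matches the sys_write+69 address in the example
disassembly, which is the kind of address "bt" shows after "at" for a
calling frame.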