Re: crash and sles 9 GUEST dumps

Daniel Li wrote:
After finding out how to get crash working with native sles 9 LKCD format dumps -- namely, build and use a debug vmlinux with appropriate flags to feed to crash -- I started looking into using crash on kernel dumps created for sles 9 guest domains.

Unlike the LKCD format of native sles 9 dumps, guest dumps are created in the new non-standard ELF format, which uses section headers instead of program headers; this is what the xenctrl library now produces. That format works for RHAS4U4 64bit guests, and with a minor modification I got it working for RHAS4U4 32bit guests as well. However, when it comes to sles 9 guests, crash seems to have problems locating the stack of every thread except the CURRENT thread (see below).
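
To check which of the two layouts a given dump file actually uses, a minimal sketch along these lines may help. It assumes a little-endian ELF64 file and reads only the standard ELF header; the function name elf_header_counts is illustrative and is not part of crash or xenctrl.

```python
import struct

# Standard ELF64 file header, little-endian, 64 bytes in total:
# e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

def elf_header_counts(header_bytes):
    """Return (e_phnum, e_shnum) from the first 64 bytes of an ELF64 file."""
    fields = ELF64_HEADER.unpack(header_bytes[:ELF64_HEADER.size])
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        # This sketch only handles ELFCLASS64 + ELFDATA2LSB.
        raise ValueError("not a little-endian 64-bit ELF file")
    e_phnum, e_shnum = fields[10], fields[12]
    return e_phnum, e_shnum

# Usage (file name taken from the session below):
#   with open("DUMP10.1.230.112", "rb") as f:
#       phnum, shnum = elf_header_counts(f.read(64))
#   print("program headers:", phnum, "section headers:", shnum)
```

A dump that carries its data in sections rather than segments would be expected to show a non-zero e_shnum; what e_phnum contains in that case depends on how the library writes the header.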

It may well be that the stack pointers were not saved properly for sles 9 guests by the Xen library in the dump. I'll take a look into the dump and the xen library code to see if that is the case... Or is this a case of crash not looking in the right places for those stack pointers?


Looking at the data below, it is hard to decipher what's going on.

The "ps" list -- except for the current task at ffffffff803d2800 -- shows
seemingly legitimate tasks, because the COMM ("task_struct.comm[16]")
strings look OK:

> crash> ps
>   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>  >     0      0   0  ffffffff803d2800  RU   0.0 4399648058624 4389578663200  [<80>^L]
>      0      0   0  ffffffff803d2808  ??   0.0       0      0  [swapper]
>      1      0   0     10017e1f2c8    ??   0.1     640    304  init
>      2     -1   0     10017e1e9a8    ??   0.0       0      0  [migration/0]
>      3     -1   0     10017e1e088    ??   0.0       0      0  [ksoftirqd/0]
...

But the state (ST) field and the PPID values above are bogus.

And that's all confirmed when you ran the "task 10015180208" command,
which simply has gdb print the task_struct at that address:

> crash> bt 10015180208
> PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
> *bt: invalid kernel virtual address: 12  type: "stack contents"*
> bt: read of stack at 12 failed
> crash>  task 10015180208
> PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
> struct task_struct {
>  *state = 1099873050624,*
> *  thread_info = 0x12,*
>  usage = {
>    counter = 320
>  },
>  flags = 0,
...
>  comm = "klogd\000roc\000\000\000\000\000\000",
...

The "state" and "thread_info" (i.e., the stack page pointer) fields
make no sense, while the "comm" field, and many of the others (upon
a quick examination) do seem correct.
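
One quick way to look at those two fields is to print them in hex, since gdb shows "state" as a decimal long. The sketch below uses only values copied from the task output in this thread; the 4 KB page size is an assumption, and the helper name describe is made up for illustration.

```python
# Values copied from the "task 10015180208" output in this thread.
fields = {
    "state": 1099873050624,
    "thread_info": 0x12,
    "thread.rsp0": 1099873058120,
}

PAGE_SIZE = 4096  # assumption: 4 KB pages on x86_64

def describe(value):
    """Return the hex form of value and whether it is page-aligned."""
    return hex(value), value % PAGE_SIZE == 0

for name, value in fields.items():
    hex_value, aligned = describe(value)
    print(f"{name:12} {hex_value:>16} page-aligned={aligned}")
```

In hex, the "state" value is 0x100158ae000, which is page-aligned and sits in the same range as the other kernel pointers in the dump, so it looks more like a pointer than a task state. That is only a hint, not a diagnosis.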

It's interesting that all of the task_struct addresses end in "8",
though.  If you were to enter "task_struct 10015180200", do those
two fields look right, and perhaps due to structure padding (?),
you'd still see the "klogd" string in the correct place?
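
The numbers already posted allow a rough cross-check of the "ends in 8" observation. The sketch below subtracts a hypothetical 8-byte shift from two TASK addresses in the "ps" list and compares the results with pointers that the kernel itself stored in the klogd task_struct. The names OFFSET and shifted are illustrative only, and the 8-byte shift is the assumption being tested, not an established fact.

```python
# TASK addresses copied from the "ps" output in this thread (a subset).
ps_tasks = {
    "init": 0x10017e1f2c8,
    "klogd": 0x10015180208,
}

# Pointers copied from the "task 10015180208" output.
klogd_group_leader = 0x10015180200
klogd_real_parent = 0x10017e1f2c0  # klogd's parent should be init

OFFSET = 8  # hypothetical shift being tested

def shifted(addr, offset=OFFSET):
    """Return the address 'offset' bytes below addr."""
    return addr - offset

checks = {
    "klogd TASK - 8 == group_leader":
        shifted(ps_tasks["klogd"]) == klogd_group_leader,
    "init TASK - 8 == klogd real_parent":
        shifted(ps_tasks["init"]) == klogd_real_parent,
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```

Both comparisons hold for the values in this thread, which fits the idea that the addresses crash lists are 8 bytes past where the kernel's own pointers say the task_structs start. It does not say why that happens.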

I'm sure this is something I've never seen before, so I'm afraid I
can't offer any answers or suggestions...

Dave



Thanks,
Daniel

/dumps/new/sles/64bit$ /home/dli/bin/crash vmlinux-2.6.5-7.244-smp vmlinux.dbg DUMP10.1.230.112

crash 4.0-4.5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: could not find MAGIC_START!
please wait... (gathering task table data)
crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: 6  type: "fill_thread_info"

crash: invalid kernel virtual address: c  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: c  type: "fill_thread_info"

crash: invalid kernel virtual address: 18  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: 12  type: "fill_thread_info"

crash: invalid kernel virtual address: 11  type: "fill_thread_info"

crash: invalid kernel virtual address: 15  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: 12  type: "fill_thread_info"

crash: invalid kernel virtual address: 6e  type: "fill_thread_info"

crash: invalid kernel virtual address: 22  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: c  type: "fill_thread_info"

crash: invalid kernel virtual address: c  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: 11  type: "fill_thread_info"

crash: invalid kernel virtual address: 10  type: "fill_thread_info"

crash: invalid kernel virtual address: c  type: "fill_thread_info"

crash: invalid kernel virtual address: 14  type: "fill_thread_info"

crash: invalid kernel virtual address: 13  type: "fill_thread_info"

crash: invalid kernel virtual address: 18  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"

crash: invalid kernel virtual address: e  type: "fill_thread_info"

crash: invalid kernel virtual address: f  type: "fill_thread_info"
please wait... (determining panic task)
bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: 6  type: "stack contents"

bt: read of stack at 6 failed


bt: invalid kernel virtual address: c  type: "stack contents"

bt: read of stack at c failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: c  type: "stack contents"

bt: read of stack at c failed


bt: invalid kernel virtual address: 18  type: "stack contents"

bt: read of stack at 18 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: 12  type: "stack contents"

bt: read of stack at 12 failed


bt: invalid kernel virtual address: 11  type: "stack contents"

bt: read of stack at 11 failed


bt: invalid kernel virtual address: 15  type: "stack contents"

bt: read of stack at 15 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: 12  type: "stack contents"

bt: read of stack at 12 failed


bt: invalid kernel virtual address: 6e  type: "stack contents"

bt: read of stack at 6e failed


bt: invalid kernel virtual address: 22  type: "stack contents"

bt: read of stack at 22 failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: c  type: "stack contents"

bt: read of stack at c failed


bt: invalid kernel virtual address: c  type: "stack contents"

bt: read of stack at c failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: 11  type: "stack contents"

bt: read of stack at 11 failed


bt: invalid kernel virtual address: 10  type: "stack contents"

bt: read of stack at 10 failed


bt: invalid kernel virtual address: c  type: "stack contents"

bt: read of stack at c failed


bt: invalid kernel virtual address: 14  type: "stack contents"

bt: read of stack at 14 failed


bt: invalid kernel virtual address: 13  type: "stack contents"

bt: read of stack at 13 failed


bt: invalid kernel virtual address: 18  type: "stack contents"

bt: read of stack at 18 failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed


bt: invalid kernel virtual address: e  type: "stack contents"

bt: read of stack at e failed


bt: invalid kernel virtual address: f  type: "stack contents"

bt: read of stack at f failed

     KERNEL: vmlinux-2.6.5-7.244-smp
DEBUG KERNEL: vmlinux.dbg (2.6.5-7.244-default)
   DUMPFILE: DUMP10.1.230.112
       CPUS: 1
       DATE: Thu Jul 26 14:34:46 2007
     UPTIME: 213503982284 days, 21:34:00
LOAD AVERAGE: 0.01, 0.12, 0.07
      TASKS: 34
   NODENAME: linux
    RELEASE: 2.6.5-7.244-smp
    VERSION: #1 SMP Mon Dec 12 18:32:25 UTC 2005
    MACHINE: x86_64  (2793 Mhz)
     MEMORY: 1015808 GB
      PANIC: ""
        PID: 0
    COMMAND: "
              "
       TASK: ffffffff803d2800  (1 of 2)  [THREAD_INFO: ffffffff80590000]
        CPU: 0
      STATE: TASK_RUNNING (ACTIVE)
    WARNING: panic task not found

crash> bt
PID: 0      TASK: ffffffff803d2800  CPU: 0   COMMAND: "<80>^L"
#0 [ffffffff80591ef0] schedule at ffffffff801394e4
#1 [ffffffff80591f98] default_idle at ffffffff8010f1c0
#2 [ffffffff80591fc8] cpu_idle at ffffffff8010f65a
crash> ps
  PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
>     0      0   0  ffffffff803d2800  RU   0.0 4399648058624 4389578663200  [<80>^L]
     0      0   0  ffffffff803d2808  ??   0.0       0      0  [swapper]
     1      0   0     10017e1f2c8    ??   0.1     640    304  init
     2     -1   0     10017e1e9a8    ??   0.0       0      0  [migration/0]
     3     -1   0     10017e1e088    ??   0.0       0      0  [ksoftirqd/0]
     4     -1   0     10001b712d8    ??   0.0       0      0  [events/0]
     5     -1   0     10001b709b8    ??   0.0       0      0  [khelper]
     6     -1   0     10001b70098    ??   0.0       0      0  [kacpid]
    25     -1   0     10017dd72e8    ??   0.0       0      0  [kblockd/0]
    47     -1   0     10017dd69c8    ??   0.0       0      0  [pdflush]
    48     -1   0     10017dd60a8    ??   0.0       0      0  [pdflush]
    49     -1   0     100178272f8    ??   0.0       0      0  [kswapd0]
    50     -1   0     100178269d8    ??   0.0       0      0  [aio/0]
  1295     -1   0     100178260b8    ??   0.0       0      0  [kseriod]
  2077     -1   0     10017897308    ??   0.0       0      0  [reiserfs/0]
  2744     -1   0     10014de9488    ??   0.0       0      0  [khubd]
  3077     -1   0     10015aa13c8    ??   0.2    2560    608  hwscand
  3693     -1   0     100164e1348    ??   0.2    3568    816  syslogd
  3696     -1   0     10015180208    ??   0.3    2744   1112  klogd
  3721     -1   0     10015b0eab8    ??   0.2    3536    628  resmgrd
  3722     -1   0     10015e6e1c8    ??   0.2    4564    640  portmap
  3803     -1   0     10015d49368    ??   0.6   20036   2340  master
  3814     -1   0     10015daea58    ??   0.6   20100   2312  pickup
  3815     -1   0     10016c5a0d8    ??   0.6   20144   2364  qmgr
  3861     -1   0     10016ca2a08    ??   0.7   26800   2932  sshd
  4022     -1   0     10014c42b48    ??   0.2    6804    924  cron
  4057     -1   0     100178960c8    ??   0.2    2484    612  agetty
  4058     -1   0     10016c5b318    ??   0.5   21864   1772  login
  4059     -1   0     10016ca3328    ??   0.2    7012    936  mingetty
  4060     -1   0     10015fb5398    ??   0.2    7012    936  mingetty
  4061     -1   0     10014cc6238    ??   0.2    7012    936  mingetty
  4062     -1   0     10015b0f3d8    ??   0.2    7012    936  mingetty
  4063     -1   0     100151e7458    ??   0.2    7012    936  mingetty
  4152     -1   0     10016a180f8    ??   0.8   12716   2992  bash
crash> bt 10015180208
PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
*bt: invalid kernel virtual address: 12  type: "stack contents"*
bt: read of stack at 12 failed
crash>  task 10015180208
PID: 3696   TASK: 10015180208       CPU: 0   COMMAND: "klogd"
struct task_struct {
 *state = 1099873050624,*
*  thread_info = 0x12,*
 usage = {
   counter = 320
 },
 flags = 0,
 ptrace = 502511173631,
 lock_depth = 120,
 prio = 0,
 static_prio = 1048832,
 run_list = {
   next = 0x200200,
   prev = 0x0
 },
 array = 0x50fe72e6,
 sleep_avg = 1,
 interactive_credit = 67616128664,
 timestamp = 67616128664,
 last_ran = 0,
 activated = 0,
 policy = 18446744073709551615,
 cpus_allowed = 18446744073709551615,
 time_slice = 150,
 first_time_slice = 0,
 tasks = {
   next = 0x10015b0eb48,
   prev = 0x100164e13d8
 },
 ptrace_children = {
   next = 0x100151802a8,
   prev = 0x100151802a8
 },
 ptrace_list = {
   next = 0x100151802b8,
   prev = 0x100151802b8
 },
 mm = 0x1001546c500,
 active_mm = 0x1001546c500,
 binfmt = 0xffffffff803e70c0,
 exit_state = 0,
 exit_code = 0,
 exit_signal = 17,
 pdeath_signal = 0,
 personality = 0,
 did_exec = 0,
 pid = 3696,
 tgid = 3696,
 real_parent = 0x10017e1f2c0,
 parent = 0x10017e1f2c0,
 children = {
   next = 0x10015180320,
   prev = 0x10015180320
 },
 sibling = {
   next = 0x10015b0ebe0,
   prev = 0x100164e1470
 },
 group_leader = 0x10015180200,
 pids = {{
     pid_chain = {
       next = 0x10015180370,
       prev = 0x10015180370
     },
     pidptr = 0x10015180360,
     pid = {
       nr = 3696,
       count = {
         counter = 1
       },
       task = 0x10015180200,
       task_list = {
         next = 0x10015180348,
         prev = 0x10015180348
       },
       hash_chain = {
         next = 0x10017827470,
         prev = 0x10016ca2b80
       }
     }
   }, {
     pid_chain = {
       next = 0x100151803b8,
       prev = 0x100151803b8
     },
     pidptr = 0x100151803a8,
     pid = {
       nr = 3696,
       count = {
         counter = 1
       },
       task = 0x10015180200,
       task_list = {
         next = 0x10015180390,
         prev = 0x10015180390
       },
       hash_chain = {
         next = 0x100178274b8,
         prev = 0x10016ca2bc8
       }
     }
   }, {
     pid_chain = {
       next = 0x10015180400,
       prev = 0x10015180400
     },
     pidptr = 0x100151803f0,
     pid = {
       nr = 3696,
       count = {
         counter = 1
       },
       task = 0x10015180200,
       task_list = {
         next = 0x100151803d8,
         prev = 0x100151803d8
       },
       hash_chain = {
         next = 0x10001949240,
         prev = 0x10016ca2c10
       }
     }
   }, {
     pid_chain = {
       next = 0x10015180448,
       prev = 0x10015180448
     },
     pidptr = 0x10015180438,
     pid = {
       nr = 3696,
       count = {
         counter = 1
       },
       task = 0x10015180200,
       task_list = {
         next = 0x10015180420,
         prev = 0x10015180420
       },
       hash_chain = {
         next = 0x10001949340,
         prev = 0x10016ca2c58
       }
     }
   }},
 wait_chldexit = {
   lock = {
     lock = 1
   },
   task_list = {
     next = 0x10015180470,
     prev = 0x10015180470
   }
 },
 vfork_done = 0x0,
 set_child_tid = 0x2a95894b90,
 clear_child_tid = 0x2a95894b90,
 rt_priority = 0,
 it_real_value = 0,
 it_prof_value = 0,
 it_virt_value = 0,
 it_real_incr = 0,
 it_prof_incr = 0,
 it_virt_incr = 0,
 real_timer = {
   entry = {
     next = 0x100100,
     prev = 0x200200
   },
   expires = 29143,
   lock = {
     lock = 1
   },
   magic = 1267182958,
   function = 0xffffffff80141b50 <it_real_fn>,
   data = 1099865522688,
   base = 0x0
 },
 utime = 0,
 stime = 4,
 cutime = 0,
 cstime = 0,
 nvcsw = 13,
 nivcsw = 2,
 cnvcsw = 0,
 cnivcsw = 0,
 start_time = 53888910424,
 min_flt = 105,
 maj_flt = 0,
 cmin_flt = 0,
 cmaj_flt = 0,
 uid = 0,
 euid = 0,
 suid = 0,
 fsuid = 0,
 gid = 0,
 egid = 0,
 sgid = 0,
 fsgid = 0,
 group_info = 0xffffffff803e2a00,
 cap_effective = 4294967039,
 cap_inheritable = 0,
 cap_permitted = 4294967039,
 keep_capabilities = 0,
 user = 0xffffffff803e29a0,
 rlim = {{
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 8388608,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 0,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 3071,
     rlim_max = 3071
   }, {
     rlim_cur = 1024,
     rlim_max = 1024
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 18446744073709551615,
     rlim_max = 18446744073709551615
   }, {
     rlim_cur = 1024,
     rlim_max = 1024
   }, {
     rlim_cur = 819200,
     rlim_max = 819200
   }},
 used_math = 0,
 rcvd_sigterm = 0,
 oomkilladj = 0,
 comm = "klogd\000roc\000\000\000\000\000\000",
 link_count = 0,
 total_link_count = 0,
 sysvsem = {
   undo_list = 0x0
 },
 thread = {
   rsp0 = 1099873058120,
   rsp = 548682070920,
   userrsp = 182897429248,
   fs = 0,
   gs = 0,
   es = 0,
   ds = 0,
   fsindex = 0,
   gsindex = 0,
   debugreg0 = 0,
   debugreg1 = 0,
   debugreg2 = 0,
   debugreg3 = 0,
   debugreg6 = 0,
   debugreg7 = 0,
   cr2 = 0,
   trap_no = 0,
   error_code = 0,
   i387 = {
     fxsave = {
       cwd = 0,
       swd = 0,
       twd = 0,
       fop = 0,
       rip = 0,
       rdp = 281470681751424,
       mxcsr = 0,
       mxcsr_mask = 0,
st_space = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0},
xmm_space = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
, 0, 0, 0},
padding = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
     }
   },
   ioperm = 0,
   io_bitmap_ptr = 0x0,
   tls_array = {0, 0, 0}
 },
 fs = 0x10014c7a180,
 files = 0x10001a114c0,
 namespace = 0x100154fb900,
 signal = 0x10015184600,
 sighand = 0x0,
 blocked = {
   sig = {0}
 },
 real_blocked = {
   sig = {1099865524632}
 },
 pending = {
   list = {
     next = 0x10015180998,
     prev = 0x0
   },
   signal = {
     sig = {0}
   }
 },
 sas_ss_sp = 0,
 sas_ss_size = 0,
 notifier = 0,
 notifier_data = 0x0,
 notifier_mask = 0x0,
 security = 0x600000005,
 parent_exec_id = 1,
 self_exec_id = 1,
 alloc_lock = {
   lock = 1
 },
 proc_lock = {
   lock = 0
 },
 switch_lock = {
   lock = 0
 },
 journal_info = 0x0,
 reclaim_state = 0x10015469180,
 proc_dentry = 0x0,
 backing_dev_info = 0x10015b40940,
 io_context = 0x0,
 ptrace_message = 0,
 last_siginfo = 0x0,
 io_wait = 0xac9,
 rchar = 2292,
 wchar = 3,
 syscr = 32,
 syscw = 475,
 acct_rss_mem1 = 2743,
 acct_vm_mem1 = 4,
 acct_stimexpd = 4294967297,
 ckrm_tsklock = {
   lock = 0
 },
 ckrm_celock = {
   lock = 0
 },
 ce_data = 0xffffffff804f3f20,
 taskclass = 0x100164e1bc8,
 taskclass_link = {
   next = 0x10015b0f338,
   prev = 0xffffffff80537940
 },
 cpu_class = 0x0,
 demand_stat = {
   run = 0,
   total = 61218488692,
   last_sleep = 32000000,
   recalc_interval = 0,
   cpu_demand = 105133020
 },
 delays = {
   waitcpu_total = 3647587,
   runcpu_total = 23603870,
   iowait_total = 0,
   mem_iowait_total = 4294967311,
   runs = 0,
   num_iowaits = 0,
   num_memwaits = 0,
   splpar_total = 1431654400
 },
 map_base = 0,
 mempolicy = 0x0,
 il_next = 0,
 audit = 0x10
}

crash>

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility

