Re: Re: Re: Re: crash and sles 9 dumps (Dave Anderson)

Dave Anderson wrote:
Daniel Li wrote:

Dave Anderson wrote:

Daniel Li wrote:

It seems the problem is not one with guest dump, but the version of SLES.

After upgrading my NATIVE SLES 9 system to SP 3, exactly the same problem happened when trying to use 'crash' on the live system, with a debug Linux kernel ('vmlinux.dbg' below) built on the same system from the matching 'kernel-source' package. (During this upgrade, the Linux kernel changed from 2.6.5-7.97-smp to 2.6.5-7.244-smp, the same version as on the guest.)

Has anyone else seen this?



Did anything change in the task_struct between 2.6.5-7.97-smp and
2.6.5-7.244-smp?

Or, more likely, anything associated with the pidhash/pid_hash-related
code in the kernel?

Is the output of the crash command "help -t | grep refresh_task_table"
different when running against 2.6.5-7.97-smp vs. 2.6.5-7.244-smp?

Dave

The definition of task_struct did change between 2.6.5-7.97-smp and 2.6.5-7.244-smp. There is one new 8-byte field called 'last_ran' just before the 'tasks' list_head. This is what I don't get: why should it matter as long as the dump and the debug kernel are using the same definition?


It shouldn't.

Does the output of "help -o task_struct" on the .97 vs the .244 kernels
reflect the member offset differences as you would expect? I.e., everything
(that's not -1) coming after the new last_ran member is bumped up by 8?
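
To make the expected "bumped up by 8" pattern concrete, here is a minimal sketch. The two structs are hypothetical, heavily trimmed stand-ins and not the real SLES task_struct; only 'last_ran' and 'tasks' are named in this thread, and the other members are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct list_head. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* Hypothetical "before" layout (placeholder members, not the real task_struct). */
struct task_before {
    long state;
    unsigned long long timestamp;
    struct list_head_sketch tasks;
    int pid;
};

/* Same layout with the new 8-byte member inserted before 'tasks'. */
struct task_after {
    long state;
    unsigned long long timestamp;
    unsigned long long last_ran;
    struct list_head_sketch tasks;
    int pid;
};

/* How far a member moved between the two layouts, in bytes. */
long offset_shift(size_t before, size_t after)
{
    assert(after >= before);   /* inserting a field can only move members up */
    return (long)(after - before);
}

long state_shift(void)
{
    return offset_shift(offsetof(struct task_before, state),
                        offsetof(struct task_after, state));
}

long tasks_shift(void)
{
    return offset_shift(offsetof(struct task_before, tasks),
                        offsetof(struct task_after, tasks));
}

long pid_shift(void)
{
    return offset_shift(offsetof(struct task_before, pid),
                        offsetof(struct task_after, pid));
}
```

In this toy layout, members ahead of 'last_ran' keep their offsets and members after it move by the size of the new field, which is the pattern to look for when comparing the two "help -o task_struct" outputs.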

And are you sure there's nothing different w/respect to the pid_hash
declarations/usage?

Dave
> The output of "help -t | grep refresh_task_table" didn't change.

The reason I ask about any pid_hash-related changes is because
over the years the manner of task table handling by the crash
utility has had to change to deal with the kernel changes.
The crash-internal tt->refresh_task_table function pointer
that you see in the "help -t" output gets set during task_init()
to one of these functions:

  static void refresh_fixed_task_table(void);
  static void refresh_unlimited_task_table(void);
  static void refresh_pidhash_task_table(void);
  static void refresh_pid_hash_task_table(void);
  static void refresh_hlist_task_table(void);
  static void refresh_hlist_task_table_v2(void);

with later kernels requiring the later function in the list above.
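
As a rough illustration of that mechanism (not the actual crash source), the selection can be pictured as a function pointer that is set once at init time. The struct, the enum, and every name ending in _sketch below are hypothetical; the real task_init() decides from kernel symbol and type information.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the pid_hash layout that was detected. */
enum hash_layout_sketch {
    LAYOUT_PID_HASH_LIST,
    LAYOUT_HLIST,
    LAYOUT_HLIST_V2
};

struct task_table_sketch {
    void (*refresh_task_table)(struct task_table_sketch *);
    int refresh_count;   /* counts calls, for illustration only */
};

/* Placeholder bodies: a real refresher would walk the kernel's hash table. */
void refresh_pid_hash_sketch(struct task_table_sketch *tt) { tt->refresh_count++; }
void refresh_hlist_sketch(struct task_table_sketch *tt)    { tt->refresh_count++; }
void refresh_hlist_v2_sketch(struct task_table_sketch *tt) { tt->refresh_count++; }

/* Pick the refresher once; later callers only go through the pointer. */
void task_init_sketch(struct task_table_sketch *tt, enum hash_layout_sketch layout)
{
    assert(tt != NULL);
    tt->refresh_count = 0;
    switch (layout) {
    case LAYOUT_PID_HASH_LIST:
        tt->refresh_task_table = refresh_pid_hash_sketch;
        break;
    case LAYOUT_HLIST:
        tt->refresh_task_table = refresh_hlist_sketch;
        break;
    case LAYOUT_HLIST_V2:
        tt->refresh_task_table = refresh_hlist_v2_sketch;
        break;
    }
}
```

If the wrong case is taken at init time, every later refresh walks the hash table with the wrong assumptions, which is why the value shown by "help -t" is worth checking first.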

For a 2.6.5 vintage kernel, I'm guessing that when you did
the "help -t" it showed "refresh_pid_hash_task_table()"?

Anyway, in the two kernels that you are comparing, how is the
"pid_hash" variable declared in the kernel sources?  With
respect to the crash-internal setting of tt->refresh_task_table,
it should line up like so:

kernel: static struct list_head pid_hash[PIDTYPE_MAX][PIDHASH_SIZE];
 crash: refresh_pid_hash_task_table()

kernel: static struct hlist_head *pid_hash[PIDTYPE_MAX];
 crash: refresh_hlist_task_table()

kernel: static struct hlist_head *pid_hash;
 crash: refresh_hlist_task_table_v2()
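
The pairing above can be written down as a small lookup, sketched here under the assumption that a declaration is summarized by its element type, whether it is a pointer, and its number of array dimensions. The struct and function are hypothetical helpers for checking by hand, not part of crash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of how the kernel source declares pid_hash. */
struct pid_hash_decl_sketch {
    const char *elem_type;   /* "list_head" or "hlist_head" */
    int is_pointer;          /* nonzero if declared with a '*' */
    int array_dims;          /* number of [] dimensions */
};

/* Returns the refresh function name expected for a declaration,
 * or NULL when none of the three known shapes matches. */
const char *expected_refresh_function(const struct pid_hash_decl_sketch *d)
{
    assert(d != NULL && d->elem_type != NULL);

    /* static struct list_head pid_hash[PIDTYPE_MAX][PIDHASH_SIZE]; */
    if (strcmp(d->elem_type, "list_head") == 0 &&
        !d->is_pointer && d->array_dims == 2)
        return "refresh_pid_hash_task_table";

    /* static struct hlist_head *pid_hash[PIDTYPE_MAX]; */
    if (strcmp(d->elem_type, "hlist_head") == 0 &&
        d->is_pointer && d->array_dims == 1)
        return "refresh_hlist_task_table";

    /* static struct hlist_head *pid_hash; */
    if (strcmp(d->elem_type, "hlist_head") == 0 &&
        d->is_pointer && d->array_dims == 0)
        return "refresh_hlist_task_table_v2";

    return NULL;
}
```

A NULL result here would correspond to a declaration that fits none of the three shapes listed above.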

For whatever reason it almost looks like the task-gathering is
using the wrong function, or maybe given back-ports and such,
the SUSE kernel task-handling is now a "hybrid" that would need
its own task-gathering function in the crash utility.

With respect to the "last_ran" addition, you could always rebuild
a kernel with that field moved to the end of the task_struct,
run that kernel, and see what happens.  If the "ps" task output
is still screwed up, then it should rule that out as the problem
at hand.

Dave






