Re: crash aborts with cannot determine idle task

Look at the crash function get_idle_threads() in task.c, which is where
you're failing.  It runs through the history of the symbols that Linux
has used over the years for the run queues.  For the most recent kernels,
it looks for the "per_cpu__runqueues" symbol.  At least on 2.6.25-rc2,
the kernel still defines them in kernel/sched.c like this:

  static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

So if you do an "nm -Bn vmlinux | grep runqueues", you should see:

  # nm -Bn vmlinux-2.6.25-rc1-ext4-1 | grep runqueues
  ffffffff8082b700 d per_cpu__runqueues
  #

I'm guessing that's not the problem -- so presuming that the symbol *does*
exist, find out why it's failing to increment "cnt" in this part of
get_idle_threads():

        if (symbol_exists("per_cpu__runqueues") &&
            VALID_MEMBER(runqueue_idle)) {
                runqbuf = GETBUF(SIZE(runqueue));
                for (i = 0; i < nr_cpus; i++) {
                        if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
                                runq = symbol_value("per_cpu__runqueues") +
                                        kt->__per_cpu_offset[i];
                        } else
                                runq = symbol_value("per_cpu__runqueues");

                        readmem(runq, KVADDR, runqbuf,
                                SIZE(runqueue), "runqueues entry (per_cpu)",
                                FAULT_ON_ERROR);
                        tasklist[i] = ULONG(runqbuf + OFFSET(runqueue_idle));
                        if (IS_KVADDR(tasklist[i]))
                                cnt++;
                }
        }

Determine whether it even makes it to the inner for loop, whether
the pre-determined nr_cpus value makes sense, whether the SMP flag
reflects whether the kernel was compiled for SMP, whether the PER_CPU_OFF
flag was set, what address was calculated, etc...

Dave

Thanks for the reply, Dave. The code does reach the inner for loop, but the IS_KVADDR(tasklist[i]) check fails, which is why 'cnt' never gets incremented. tasklist[i] holds this value: 0x3d60657870722024.

I ran gdb on the vmcore file and printed the memory contents:

(gdb) print per_cpu__runqueues
$1 = {lock = {raw_lock = {slock = 1431524419}}, nr_running = 5283422954284598606,
  raw_weighted_load = 5064663116585906736,
  cpu_load = {2316051155752670036, 5929356451801411872, 2613857225664584019},
  nr_switches = 5644502509443686462, nr_uninterruptible = 2316072106569976142,
  expired_timestamp = 5142904381182533935, timestamp_last_tick = 7235439831918129227,
  curr = 0x5f66696c650a5243,
  idle = 0x3d60657870722024,     <<<-----
  prev_mm = 0x5243202b20243f60, active = 0xa247b4155535443, expired = 0x5352434449527d2f,


Does this mean that the kernel data was corrupted when the vmcore was collected?

Thanks,
Chandru

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
