Hi,

I tracked this one down to commit
88a2a4ac6b671a4b0dd5d2d762418904c05f4104 (percpu data: only iterate over
possible CPUs). I don't know if this is the correct way to fix it, but
the following patch makes the problem go away for me.

--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6021,7 +6021,7 @@ void __init sched_init(void)
 	runqueue_t *rq;
 	int i, j, k;
 
-	for_each_cpu(i) {
+	for (i = 0; i < NR_CPUS; i++) {
 		prio_array_t *array;
 
 		rq = cpu_rq(i);

Any other suggestions on how to fix this?

Thanks,
Rojhalat Ibrahim

Mark E Mason wrote:
> [Cross-posted from LKML]
>
> Hello all,
>
> Working from the linux-mip.org repository (which just recently merged
> from the kernel.org repository), we've been getting exceptions on
> several different processors due to NULL pointer dereferences in
> sched.c. These happen on SMP systems only (but both 32- and 64-bit
> systems trigger this problem).
>
> The Oops output and surrounding text (w/ backtrace) is below. What I've
> traced it down to so far is that enqueue_task() gets called with a ready
> queue (rq) where (rq->active == NULL).
>
> Backtracing a bit, the following patch triggers an earlier, slightly
> more controlled failure:
>
> [mason@hawaii linux.git]$ git diff kernel/sched.c
> diff --git a/kernel/sched.c b/kernel/sched.c
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1264,6 +1264,7 @@ static int try_to_wake_up(task_t *p, uns
>  #endif
>
>         rq = task_rq_lock(p, &flags);
> +       BUG_ON(rq->active == NULL);
>         old_state = p->state;
>         if (!(old_state & state))
>                 goto out;
>
>
> My question is, is the above assert valid (i.e., should rq->active always
> be non-NULL at this point)? It seems like it should be, but I'm pretty
> new to this code, and thought I should double-check before going off
> into the weeds.
>
> If anyone has any ideas about where specifically to look for the
> underlying problem, I'd appreciate it.
>
> Thanks (very much) in advance,
> Mark Mason
> mason@xxxxxxxxxxxx
> Newberg, Oregon
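
For anyone trying to picture why the loop change in sched_init() matters, here is a small user-space model. It is only a sketch and not kernel code. It rests on an assumption this thread has not confirmed: that on the affected platforms some CPUs are not yet marked possible when sched_init() runs and get marked later. The names cpu_possible[], struct runqueue, init_rq(), reset_model() and rq_usable() are made up for illustration; only NR_CPUS, rq->active and the two loop shapes come from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Toy stand-ins for the kernel's prio_array_t and runqueue_t. */
struct prio_array { int nr_active; };
struct runqueue {
	struct prio_array *active;      /* NULL until the runqueue is set up */
	struct prio_array arrays[2];
};

struct runqueue runqueues[NR_CPUS];
bool cpu_possible[NR_CPUS];             /* hypothetical model of the possible-CPU map */

/* Clear the model so each scenario starts from zeroed state. */
void reset_model(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		runqueues[i].active = NULL;
		cpu_possible[i] = false;
	}
}

/* Per-CPU setup: point 'active' at the first priority array. */
void init_rq(int cpu)
{
	runqueues[cpu].arrays[0].nr_active = 0;
	runqueues[cpu].arrays[1].nr_active = 0;
	runqueues[cpu].active = &runqueues[cpu].arrays[0];
}

/* Models the loop after the commit: only CPUs already marked possible. */
void sched_init_possible_only(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (cpu_possible[i])
			init_rq(i);
}

/* Models the loop in the patch above: every slot up to NR_CPUS. */
void sched_init_all(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		init_rq(i);
}

/* The condition the BUG_ON() in the quoted mail checks for. */
bool rq_usable(int cpu)
{
	return runqueues[cpu].active != NULL;
}
```

In this model, if only CPU 0 is marked possible before sched_init_possible_only() runs and CPU 1 is marked afterwards, rq_usable(1) is false, which corresponds to the rq->active == NULL case reported below. With sched_init_all() every runqueue is usable regardless of when the map is filled in. If that is really what happens here, the alternative fix would be to populate the possible map earlier in the platform code rather than to widen the loop.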