On Tue, Apr 28, 2009 at 02:46:45PM +0200, Manuel Lauss wrote:

> From time to time, my test systems don't boot correctly and spew the
> following oops in futex_init():
>
> calling  init_timer_list_procfs+0x0/0x40 @ 1
> initcall init_timer_list_procfs+0x0/0x40 returned 0 after 29 usecs
> calling  futex_init+0x0/0xac @ 1
> Reserved instruction in kernel code[#1]:
> Cpu 0
> $ 0   : 00000000 10003c00 00000000 00000001
> $ 4   : fffffff2 00000000 32e02014 00000000
> $ 8   : 00000000 00000000 c4653600 000000cd
> $12   : 3b9aca00 000186a0 870ce3f0 0000000d
> $16   : 32e02014 00000000 00000000 8042f0dc
> $20   : 00000000 00000000 00000000 00000000
> $24   : 00000005 80243a3c
> $28   : 87020000 87021f30 00000000 80100460
> Hi    : 00000000
> Lo    : 00000000
> epc   : 8042f0f8 futex_init+0x1c/0xac
>     Not tainted
> ra    : 80100460 _stext+0x60/0x1c8
> Status: 10003c03    KERNEL EXL IE
> Cause : 00808028
> PrId  : 04030202 (Au1250)
> Modules linked in:
> Process swapper (pid: 1, threadinfo=87020000, task=87018000, tls=00000000)
> Stack : 00000000 8042f0dc 00000001 00002543 0000001d 00000000 87021f00 8014f014
>         0000000e 00000000 8702a900 87002000 00003137 00000000 00000000 801ba18c
>         8041e7a0 000000e0 80410000 00000000 00000000 8014f09c 32e02014 00000000
>         80448360 804484f4 00000000 00000000 00000000 80428304 00000000 00000000
>         00000000 00000000 87020000 00000000 00000000 80106ea4 10003c03 00000000
>         ...
> Call Trace:
> [<8042f0f8>] futex_init+0x1c/0xac
> [<80100460>] _stext+0x60/0x1c8
> [<80428304>] kernel_init+0x98/0x104
> [<80106ea4>] kernel_thread_helper+0x10/0x18
>
>
> Code: 30420004 14400008 2404fff2 <c0440000> 14800005 00000000 00000821 e0410000 1020fffa
> Disabling lock debugging due to kernel taint
> note: swapper[1] exited with preempt_count 1
> Kernel panic - not syncing: Attempted to kill init!
>
>
> Disassembly of futex_init():
>
> (gdb) disass 0x8042f0f8
> Dump of assembler code for function futex_init:
> 0x8042f0dc <futex_init+0>:      lw      v1,20(gp)
> 0x8042f0e0 <futex_init+4>:      addiu   v1,v1,1
> 0x8042f0e4 <futex_init+8>:      sw      v1,20(gp)
> 0x8042f0e8 <futex_init+12>:     lw      v0,24(gp)
> 0x8042f0ec <futex_init+16>:     andi    v0,v0,0x4
> 0x8042f0f0 <futex_init+20>:     bnez    v0,0x8042f114 <futex_init+56>
> 0x8042f0f4 <futex_init+24>:     li      a0,-14
> 0x8042f0f8 <futex_init+28>:     ll      a0,0(v0)

So this is in futex_atomic_cmpxchg_inatomic, which has been inlined into
futex_init.  The epc is pointing to this LL instruction, which is a
legitimate MIPS32 instruction, so a reserved instruction exception does not
make sense.

However, a NULL pointer has intentionally been passed as the argument here,
so this LL instruction will take a TLB exception.  do_page_fault() will then
change the EPC to point to the fixup handler, which in the sources is these
lines:

	"	.section .fixup,\"ax\"				\n"
	"4:	li	%0, %5					\n"
	"	j	3b					\n"
	"	.previous					\n"
	"	.section __ex_table,\"a\"			\n"
	"	"__UA_ADDR "\t1b, 4b				\n"
	"	"__UA_ADDR "\t2b, 4b				\n"
	"	.previous					\n"

That's how it normally should function.  If, however, something goes wrong
in the exception handler while c0_status.exl is still set, the c0_epc
register won't be updated for the 2nd exception, which is that reserved
instruction exception.  That would explain why the oops still shows the epc
at the LL.

This sort of bug can be ugly to chase, I'm afraid.

  Ralf
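
For illustration only, the EXL behaviour described above can be sketched as a toy C model.  This is not kernel code: struct cp0, take_exception(), find_fixup(), replay() and demo_table are names made up for this sketch, the fixup address 0x80500000 is a placeholder, and the starting Status value is simply the one from the oops with EXL cleared.  The real kernel looks up __ex_table with a sorted search; a linear scan is used here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ST0_EXL 0x00000002u	/* c0_status.EXL, bit 1 */

/* Toy model of the two CP0 registers involved (hypothetical type). */
struct cp0 {
	uint32_t status;
	uint32_t epc;
};

/*
 * Exception entry as MIPS32 specifies it: EPC is captured only if EXL
 * was clear on entry; EXL is set in either case.
 */
static void take_exception(struct cp0 *c, uint32_t faulting_pc)
{
	if (!(c->status & ST0_EXL))
		c->epc = faulting_pc;
	c->status |= ST0_EXL;
}

/* One __ex_table-style pair: faulting instruction -> fixup address. */
struct ex_entry {
	uint32_t insn;
	uint32_t fixup;
};

/* The LL address is from the oops; the fixup address is a placeholder. */
static const struct ex_entry demo_table[] = {
	{ 0x8042f0f8u, 0x80500000u },
};

/* Linear lookup; returns the fixup address, or 0 if there is none. */
static uint32_t find_fixup(const struct ex_entry *tab, size_t n, uint32_t pc)
{
	assert(tab != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tab[i].insn == pc)
			return tab[i].fixup;
	return 0;
}

/*
 * Replays the suspected sequence: a TLB exception on the LL, then a
 * second exception at second_pc before the handler has cleared EXL.
 * Returns the EPC an oops would report afterwards.
 */
static uint32_t replay(uint32_t ll_pc, uint32_t second_pc)
{
	struct cp0 c = { .status = 0x10003c01u, .epc = 0 };

	take_exception(&c, ll_pc);	/* TLB exception on ll a0,0(v0) */
	take_exception(&c, second_pc);	/* e.g. RI inside the handler   */
	return c.epc;
}
```

In this model, replay(0x8042f0f8, x) returns 0x8042f0f8 for any x, so the reported epc says nothing about where the second exception was actually raised.  On the normal path there is only the first exception, and find_fixup() supplies the address the handler would put into EPC before returning.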