A machine I manage crashed today. The console (25 lines) showed a kernel fault in an interrupt handler. After the reboot I found two pre-crash kernel oopses in the log (see attached). What the two have in common seems to be a spin_lock in mm/slab.c that was already locked.

The oopses were logged at 10:43 and 11:01; the machine crashed somewhere between 11:01 and 11:10, judging by /var/log/cron. (How can I get a more accurate time of crash?)

The machine is running FC2, kernel 2.6.8-1.521 #1 Mon Aug 16 09:01:18 EDT 2004 i686 i686 i386. It had been up for about two weeks, since it was first put in service with this OS installation, and there was no apparent unusual activity on it today. It's an NFS fileserver.

Any ideas? Is this likely a problem in the kernel, or with my hardware?

Thanks,
David

Oct 27 10:43:12 powraid kernel: Unable to handle kernel paging request at virtual address 00020004
Oct 27 10:43:12 powraid kernel: printing eip:
Oct 27 10:43:12 powraid kernel: 02146bc4
Oct 27 10:43:12 powraid kernel: *pde = 00000000
Oct 27 10:43:12 powraid kernel: Oops: 0002 [#1]
Oct 27 10:43:12 powraid kernel: Modules linked in: nfs nfsd exportfs lockd md5 ipv6 autofs4 sunrpc tulip 8139too mii floppy sg microcode xfs ohci_hcd ehci_hcd button battery asus_acpi ac ext3 jbd dm_mod mptscsih mptbase sd_mod scsi_mod
Oct 27 10:43:12 powraid kernel: CPU: 0
Oct 27 10:43:12 powraid kernel: EIP: 0060:[<02146bc4>] Not tainted
Oct 27 10:43:12 powraid kernel: EFLAGS: 00010006 (2.6.8-1.521)
Oct 27 10:43:12 powraid kernel: EIP is at free_block+0x3a/0xbb
Oct 27 10:43:12 powraid kernel: eax: 00020000 ebx: 0204b000 ecx: 0204b080 edx: 05f9d080
Oct 27 10:43:12 powraid kernel: esi: 0f754a80 edi: 0000001b ebp: 00000017 esp: 0c8fbacc
Oct 27 10:43:12 powraid kernel: ds: 007b es: 007b ss: 0068
Oct 27 10:43:12 powraid kernel: Process nfsd (pid: 2006, threadinfo=0c8fb000 task=0c928cd0)
Oct 27 10:43:12 powraid kernel: Stack: 0eaad090 0f754a80 0eaad090 0edac680 0f363d60 02146d1d 0000001b 0eaad080
Oct 27 10:43:12 powraid kernel: 0eaad080 0eaad090 0edac680 00000206 02147143 0edac6b4 0c8fbb34 00000008
Oct 27 10:43:12 powraid kernel: 00000000 0217e596 0edac6b4 0217e949 05f9d734 0000000a 02348c20 0217f079
Oct 27 10:43:12 powraid kernel: Call Trace:
Oct 27 10:43:12 powraid kernel: [<02146d1d>] cache_flusharray+0xd8/0x15a
Oct 27 10:43:12 powraid kernel: [<02147143>] kmem_cache_free+0x21/0x2f
Oct 27 10:43:12 powraid kernel: [<0217e596>] destroy_inode+0x36/0x45
Oct 27 10:43:12 powraid kernel: [<0217e949>] dispose_list+0x4e/0x160
Oct 27 10:43:12 powraid kernel: [<0217f079>] prune_icache+0x366/0x3a3
Oct 27 10:43:12 powraid kernel: [<0217f0c3>] shrink_icache_memory+0xd/0x24
Oct 27 10:43:12 powraid kernel: [<0214957b>] shrink_slab+0xf9/0x15c
Oct 27 10:43:12 powraid kernel: [<0214abf5>] try_to_free_pages+0xa9/0x14e
Oct 27 10:43:12 powraid kernel: [<021422af>] __alloc_pages+0x1c8/0x2be
Oct 27 10:43:12 powraid kernel: [<0213f916>] generic_file_aio_write_nolock+0x502/0x855
Oct 27 10:43:12 powraid kernel: [<109da583>] xfs_ichgtime+0xf4/0xfc [xfs]
Oct 27 10:43:12 powraid kernel: [<109fec5a>] xfs_write+0x3e5/0x678 [xfs]
Oct 27 10:43:12 powraid kernel: [<109fb43f>] linvfs_writev+0xdf/0x120 [xfs]
Oct 27 10:43:12 powraid kernel: [<021d9ac7>] __copy_from_user_ll+0x41/0x4a
Oct 27 10:43:12 powraid kernel: [<109fb360>] linvfs_writev+0x0/0x120 [xfs]
Oct 27 10:43:12 powraid kernel: [<02160f4c>] do_readv_writev+0x15d/0x1de
Oct 27 10:43:12 powraid kernel: [<02160aa5>] do_sync_write+0x0/0x99
Oct 27 10:43:12 powraid kernel: [<10a95449>] fh_verify+0x497/0x4af [nfsd]
Oct 27 10:43:12 powraid kernel: [<109fb606>] linvfs_open+0x39/0x3d [xfs]
Oct 27 10:43:12 powraid kernel: [<02161875>] open_private_file+0x9c/0xb7
Oct 27 10:43:12 powraid kernel: [<02161048>] vfs_writev+0x3d/0x41
Oct 27 10:43:12 powraid kernel: [<10a9711a>] nfsd_write+0x112/0x2a6 [nfsd]
Oct 27 10:43:12 powraid kernel: [<0211a930>] __wake_up_common+0x36/0x5b
Oct 27 10:43:12 powraid kernel: [<0211a9e2>] __wake_up+0x8d/0xf2
Oct 27 10:43:12 powraid kernel: [<10a9cff3>] nfsd3_proc_write+0xc7/0xde [nfsd]
Oct 27 10:43:12 powraid kernel: [<10a9e6ae>] nfs3svc_decode_writeargs+0x0/0x159 [nfsd]
Oct 27 10:43:12 powraid kernel: [<10a93a20>] nfsd_dispatch+0xbf/0x163 [nfsd]
Oct 27 10:43:12 powraid kernel: [<10970ba9>] svc_process+0x323/0x562 [sunrpc]
Oct 27 10:43:12 powraid kernel: [<10a93683>] nfsd+0x3ae/0x68c [nfsd]
Oct 27 10:43:12 powraid kernel: [<10a932d5>] nfsd+0x0/0x68c [nfsd]
Oct 27 10:43:12 powraid kernel: [<021041d9>] kernel_thread_helper+0x5/0xb
Oct 27 10:43:12 powraid kernel: Code: 89 50 04 89 02 31 d2 2b 4b 0c c7 03 00 01 10 00 c7 43 04 00
Oct 27 10:43:13 powraid kernel: mm/slab.c:2725: spin_lock(mm/slab.c:0f754ac4) already locked by mm/slab.c/2141
Oct 27 11:01:00 powraid kernel: Unable to handle kernel paging request at virtual address 00020004
Oct 27 11:01:00 powraid kernel: printing eip:
Oct 27 11:01:00 powraid kernel: 02146aa1
Oct 27 11:01:00 powraid kernel: *pde = 00000000
Oct 27 11:01:00 powraid kernel: Oops: 0002 [#2]
Oct 27 11:01:00 powraid kernel: Modules linked in: nfs nfsd exportfs lockd md5 ipv6 autofs4 sunrpc tulip 8139too mii floppy sg microcode xfs ohci_hcd ehci_hcd button battery asus_acpi ac ext3 jbd dm_mod mptscsih mptbase sd_mod scsi_mod
Oct 27 11:01:00 powraid kernel: CPU: 0
Oct 27 11:01:00 powraid kernel: EIP: 0060:[<02146aa1>] Not tainted
Oct 27 11:01:00 powraid kernel: EFLAGS: 00010046 (2.6.8-1.521)
Oct 27 11:01:00 powraid kernel: EIP is at cache_alloc_refill+0x154/0x23d
Oct 27 11:01:00 powraid kernel: eax: 00020000 ebx: 0f754b80 ecx: 0cc75000 edx: 0f754b8c
Oct 27 11:01:00 powraid kernel: esi: 00000008 edi: 0f754b8c ebp: 0f7ae080 esp: 0f267e08
Oct 27 11:01:00 powraid kernel: ds: 007b es: 007b ss: 0068
Oct 27 11:01:00 powraid kernel: Process bash (pid: 23727, threadinfo=0f267000 task=0d242640)
Oct 27 11:01:00 powraid kernel: Stack: 00000050 00000050 0f754b80 00000246 0f709000 02146de5 0042da8f 0f709000
Oct 27 11:01:00 powraid kernel: 031f386c 108c69b3 0217e3f1 0042da8f 0f709000 031f386c 0042da8f 0217f5a0
Oct 27 11:01:00 powraid kernel: 0042da8f 02d1aeb8 0f709000 02ff60f0 108c40e3 062b92e8 108d3fc0 02d1aeb8
Oct 27 11:01:00 powraid kernel: Call Trace:
Oct 27 11:01:00 powraid kernel: [<02146de5>] kmem_cache_alloc+0x46/0x4c
Oct 27 11:01:00 powraid kernel: [<108c69b3>] ext3_alloc_inode+0xf/0x3c [ext3]
Oct 27 11:01:00 powraid kernel: [<0217e3f1>] alloc_inode+0x13/0x182
Oct 27 11:01:00 powraid kernel: [<0217f5a0>] get_new_inode_fast+0xf/0x21f
Oct 27 11:01:00 powraid kernel: [<108c40e3>] ext3_lookup+0x42/0x89 [ext3]
Oct 27 11:01:00 powraid kernel: [<0216ffe8>] real_lookup+0x73/0xde
Oct 27 11:01:00 powraid kernel: [<0217034a>] do_lookup+0x43/0x72
Oct 27 11:01:00 powraid kernel: [<02171039>] link_path_walk+0xcc0/0x1017
Oct 27 11:01:00 powraid kernel: [<0217160b>] path_lookup+0xff/0x12f
Oct 27 11:01:00 powraid kernel: [<02171747>] __user_walk+0x21/0x51
Oct 27 11:01:00 powraid kernel: [<0216bac4>] vfs_stat+0x14/0x3a
Oct 27 11:01:00 powraid kernel: [<0216c03f>] sys_stat64+0xf/0x23
Oct 27 11:01:00 powraid kernel: [<021181a7>] do_page_fault+0x0/0x489
Oct 27 11:01:00 powraid kernel: Code: 89 50 04 89 02 66 83 79 14 ff c7 01 00 01 10 00 c7 41 04 00
Oct 27 11:01:01 powraid kernel: mm/slab.c:2656: spin_lock(mm/slab.c:0f754bc4) already locked by mm/slab.c/1921
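
In case it helps anyone compare the two oopses side by side, here is a rough Python sketch that pulls the faulting address, the function named on the "EIP is at" line, and the spin_lock warning out of a log like the one above. The function name `summarize_oopses` and the regular expression are my own assumptions, written only against the message formats in this log; other kernel versions may print these lines differently. `SAMPLE` holds six lines copied from the log.

```python
import re

# One alternative per field of interest. Written only against the message
# formats seen in this particular log (an assumption, not a general parser).
TOKEN_RE = re.compile(
    r"virtual address (?P<fault>[0-9a-f]{8})"
    r"|EIP is at (?P<eip>\w+)\+0x"
    r"|spin_lock\(mm/slab\.c:(?P<lock>[0-9a-f]{8})\) "
    r"already locked by (?P<holder>mm/slab\.c/\d+)"
)


def summarize_oopses(log_text):
    """Return one dict per oops: fault address, EIP symbol, lock address, holder."""
    records = []
    for m in TOKEN_RE.finditer(log_text):
        if m.group("fault"):
            # A paging-request line starts a new oops record.
            records.append(
                {"fault": m.group("fault"), "eip": None, "lock": None, "holder": None}
            )
        elif not records:
            # Ignore stray matches that appear before the first oops.
            continue
        elif m.group("eip"):
            records[-1]["eip"] = m.group("eip")
        else:
            records[-1]["lock"] = m.group("lock")
            records[-1]["holder"] = m.group("holder")
    return records


# Six lines copied from the log above.
SAMPLE = """\
Oct 27 10:43:12 powraid kernel: Unable to handle kernel paging request at virtual address 00020004
Oct 27 10:43:12 powraid kernel: EIP is at free_block+0x3a/0xbb
Oct 27 10:43:13 powraid kernel: mm/slab.c:2725: spin_lock(mm/slab.c:0f754ac4) already locked by mm/slab.c/2141
Oct 27 11:01:00 powraid kernel: Unable to handle kernel paging request at virtual address 00020004
Oct 27 11:01:00 powraid kernel: EIP is at cache_alloc_refill+0x154/0x23d
Oct 27 11:01:01 powraid kernel: mm/slab.c:2656: spin_lock(mm/slab.c:0f754bc4) already locked by mm/slab.c/1921
"""

if __name__ == "__main__":
    for record in summarize_oopses(SAMPLE):
        print(record)
```

The matching runs over the whole text rather than line by line, so it should not matter whether the mail client has rewrapped the log.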