Anyone? Bueller?

Just had another of these crashes, after moving the disks and new memory
to a new DL380G4 box. Starting to look very much like a kernel problem,
but I do not know the best way to approach such a debugging task.

Any advice is appreciated.
-Alan

Alan Sparks said:
> Been having this problem, and posted about it before. Thinking it was a
> memory issue, I've replaced all memory on the server. However, the
> problem has continued.
>
> Server is a Proliant DL380 (8GB RAM, 2 Xeon CPU), running CentOS 3.6, all
> patches up-to-date. Kernel is 2.4.21-40.ELsmp (problem seems to have
> first manifested on kernel 2.4.21-37.0.1.ELsmp). Disk is CCISS hardware
> RAID-5, straight partitioning (no LVM).
>
> This server runs an Oracle 10g instance, async I/O enabled to a NetApp
> filer where most data is stored.
>
> Traceback of latest failure follows (two crashes this morning). Anyone
> read these things well enough to tell me if there's any insight in this?
> There are no nVidia drivers loaded, only stock kernel modules.
>
> Thanks in advance for any insight.
> -Alan
>
>
> Apr 10 05:32:22 db01-01 kernel: page not mapped. erroring out.
> Apr 10 05:32:22 db01-01 kernel: Page has mapping still set. This is a serious situation. However if you
> Apr 10 05:32:22 db01-01 kernel: are using the NVidia binary only module please report this bug to
> Apr 10 05:32:22 db01-01 kernel: NVidia and not to the linux kernel mailinglist.
> Apr 10 05:32:22 db01-01 kernel: ------------[ cut here ]------------
> Apr 10 05:32:22 db01-01 kernel: kernel BUG at page_alloc.c:225!
> Apr 10 05:32:22 db01-01 kernel: invalid operand: 0000
> Apr 10 05:32:22 db01-01 kernel: sg nfs lockd sunrpc tg3 microcode keybdev mousedev hid input ehci-hcd usb-uhci usbcore ext3 jbd cciss sd_mod scsi_mod
> Apr 10 05:32:22 db01-01 kernel: CPU: 1
> Apr 10 05:32:22 db01-01 kernel: EIP: 0060:[<c0159560>] Not tainted
> Apr 10 05:32:22 db01-01 kernel: EFLAGS: 00010286
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: EIP is at __free_pages_ok [kernel] 0x3e0 (2.4.21-40.ELsmp/i686)
> Apr 10 05:32:22 db01-01 kernel: eax: 00000033 ebx: c797dd38 ecx: 00000001 edx: c0387e98
> Apr 10 05:32:22 db01-01 kernel: esi: f4402880 edi: 00000000 ebp: 00000000 esp: cd7d5ec8
> Apr 10 05:32:22 db01-01 kernel: ds: 0068 es: 0068 ss: 0068
> Apr 10 05:32:22 db01-01 kernel: Process keventd (pid: 6, stackpage=cd7d5000)
> Apr 10 05:32:22 db01-01 kernel: Stack: c02c1ea8 00000363 c000a750 ff0ea000 c0440280 00000000 cdbac000 efc21f00
> Apr 10 05:32:22 db01-01 kernel: 00000000 00000001 00000001 00000086 dab95054 00000001 f4402880 00000000
> Apr 10 05:32:22 db01-01 kernel: 00000000 c014cf3e 00000001 00000000 00000000 cd7d4000 00000000 00000e00
> Apr 10 05:32:22 db01-01 kernel: Call Trace: [<c014cf3e>] __iodesc_free [kernel] 0xde (0xcd7d5f0c)
> Apr 10 05:32:22 db01-01 kernel: [<c0161e9c>] kmap_high [kernel] 0x5c (0xcd7d5f28)
> Apr 10 05:32:22 db01-01 kernel: [<c014d87b>] __iodesc_read_finish [kernel] 0x22b (0xcd7d5f38)
> Apr 10 05:32:22 db01-01 kernel: [<c01302ca>] __run_task_queue [kernel] 0x6a (0xcd7d5f74)
> Apr 10 05:32:22 db01-01 kernel: [<c013c9ad>] context_thread [kernel] 0x13d (0xcd7d5f8c)
> Apr 10 05:32:22 db01-01 kernel: [<c013c870>] context_thread [kernel] 0x0 (0xcd7d5fe0)
> Apr 10 05:32:22 db01-01 kernel: [<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xcd7d5ff0)
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: Code: 0f 0b e1 00 33 17 2c c0 e9 6c fc ff ff 9c 5a fa f0 fe 0d 70
> Apr 10 05:32:22 db01-01 kernel:
> Apr 10 05:32:22 db01-01 kernel: Kernel panic: Fatal exception
>
>
> ===========
> Alan Sparks, UNIX/Linux Systems Administrator
> <asparks@xxxxxxxxxxxxxxxx>
>
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> http://lists.centos.org/mailman/listinfo/centos
>

===========
Alan Sparks, UNIX/Linux Systems Administrator
<asparks@xxxxxxxxxxxxxxxx>