Hello,

I am running a Red Hat AS kernel (2.6.9-11.ELsmp) on a dual-processor AMD machine. It had a kernel panic last night; an extract of /var/log/messages is in the attached file. The machine is a backup server, and the panic happened during very heavy batch processing. You will also see that the SMART readings look wrong: the hard disks are not that hot.

I have looked at the code, and everything seems to happen in fs/ext3. It "seems" that during ext3_ordered_writepage, the filesystem tries to walk the page's buffers (walk_page_buffers) but cannot, because the "page" is NULL. That is how I read the trace.

My first idea is to fix it with something like this:

    if (!page) goto out_fail;

But I feel that this is not the right way, or maybe my reasoning is wrong.

Is there an ext3 maintainer on the plane? :)

--
Loiseleur Michel - TM2L (08000LINUX)
LINAGORA
27, rue de Berri
1er étage
75008 PARIS
Tél : 01 58 18 68 28
Fax : 01 58 18 68 29
"Si hoc legere scis nimium eruditionis habes"

Aug 4 01:01:01 ju crond(pam_unix)[26634]: session opened for user root by (uid=0)
Aug 4 01:01:19 ju crond(pam_unix)[26634]: session closed for user root
Aug 4 01:03:50 ju smartd[1745]: Device: /dev/hdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 58 to 57
Aug 4 01:03:50 ju smartd[1745]: Device: /dev/hdc, SMART Usage Attribute: 194 Temperature_Celsius changed from 240 to 29
Aug 4 01:03:50 ju smartd[1745]: Device: /dev/hdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 58 to 57
Aug 4 01:03:50 ju smartd[1745]: Device: /dev/hdd, SMART Usage Attribute: 194 Temperature_Celsius changed from 196 to 203
Aug 4 01:06:01 ju crond(pam_unix)[6980]: session opened for user root by (uid=0)
Aug 4 01:06:10 ju crond(pam_unix)[6980]: session closed for user root
Aug 4 01:08:09 ju kernel: Unable to handle kernel paging request at virtual address 006c0070
Aug 4 01:08:09 ju kernel: printing eip:
Aug 4 01:08:09 ju kernel: f891bc87
Aug 4 01:08:09 ju kernel: *pde = 00000000
Aug 4 01:08:09 ju kernel: Oops: 0000 [#1]
Aug 4 01:08:09 ju kernel: SMP
Aug 4 01:08:09 ju kernel: Modules linked in: nfsd exportfs lockd sunrpc basp(U) md5 ipv6 i2c_dev i2c_core dm_mod button battery ac hw_random e1000 floppy ext3 jbd raid1 aic7xxx sd_mod scsi_mod
Aug 4 01:08:09 ju kernel: CPU: 0
Aug 4 01:08:09 ju kernel: EIP: 0060:[<f891bc87>] Tainted: P VLI
Aug 4 01:08:09 ju kernel: EFLAGS: 00010202 (2.6.9-11.ELsmp)
Aug 4 01:08:09 ju kernel: EIP is at walk_page_buffers+0x1e/0x87 [ext3]
Aug 4 01:08:09 ju kernel: eax: c3ebd901 ebx: 00002000 ecx: 006c006c edx: c3ebd900
Aug 4 01:08:09 ju kernel: esi: 00002000 edi: c3ebd904 ebp: 00000000 esp: f7cb9e28
Aug 4 01:08:09 ju kernel: ds: 007b es: 007b ss: 0068
Aug 4 01:08:09 ju kernel: Process pdflush (pid: 34, threadinfo=f7cb9000 task=f7ca05f0)
Aug 4 01:08:09 ju kernel: Stack: 006c006c 00001000 00000000 f4344438 c153e080 f4344438 c3ebd904 f4344438
Aug 4 01:08:09 ju kernel: f891c23b 00001000 00000000 f891c15d f7cb9f64 c153e080 f7cb9f64 c9671410
Aug 4 01:08:09 ju kernel: 0000000e c017336e 0000000d 00000000 00000001 ffffffff f891c17d 00000000
Aug 4 01:08:09 ju kernel: Call Trace:
Aug 4 01:08:09 ju kernel: [<f891c23b>] ext3_ordered_writepage+0xbe/0x13a [ext3]
Aug 4 01:08:09 ju kernel: [<f891c15d>] bget_one+0x0/0x7 [ext3]
Aug 4 01:08:09 ju kernel: [<c017336e>] mpage_writepages+0x1c2/0x314
Aug 4 01:08:09 ju kernel: [<f891c17d>] ext3_ordered_writepage+0x0/0x13a [ext3]
Aug 4 01:08:09 ju kernel: [<c0171ce8>] __sync_single_inode+0x5f/0x1c1
Aug 4 01:08:09 ju kernel: [<c017207c>] sync_sb_inodes+0x1a7/0x274
Aug 4 01:08:09 ju kernel: [<c01411ec>] pdflush+0x0/0x1e
Aug 4 01:08:09 ju kernel: [<c01721da>] writeback_inodes+0x91/0xde
Aug 4 01:08:09 ju kernel: [<c014089d>] background_writeout+0x65/0x97
Aug 4 01:08:09 ju kernel: [<c0141158>] __pdflush+0xec/0x180
Aug 4 01:08:09 ju kernel: [<c0141206>] pdflush+0x1a/0x1e
Aug 4 01:08:09 ju kernel: [<c0140838>] background_writeout+0x0/0x97
Aug 4 01:08:09 ju kernel: [<c01411ec>] pdflush+0x0/0x1e
Aug 4 01:08:09 ju kernel: [<c0132e31>] kthread+0x73/0x9b
Aug 4 01:08:09 ju kernel: [<c0132dbe>] kthread+0x0/0x9b
Aug 4 01:08:09 ju kernel: [<c01041f1>] kernel_thread_helper+0x5/0xb
Aug 4 01:08:09 ju kernel: Code: 06 fb ff ff ff 31 c9 5a 89 c8 5b 5e c3 55 31 ed 57 89 d7 56 31 f6 53 83 ec 10 89 4c 24 08 89 d1 89 44 24 0c 8b 42 10 89 44 24 04 <8b> 41 04 89 04 24 8b 44 24 04 8d 1c 06 3b 5c 24 08 0f 96 c0 3b
Aug 4 01:08:09 ju kernel: <0>Fatal exception: panic in 5 seconds
Aug 4 08:32:18 ju syslogd 1.4.1: restart.
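
A small cross-check of the registers in the oops above, offered as a sketch rather than a diagnosis. The two constants are copied from the log; the helper names fault_offset and null_guard_catches are made up for this illustration and do not exist in the kernel. The bytes marked <8b> 41 04 in the Code: line decode to a 32-bit load from ecx+4, so the faulting address should equal ecx plus 4.

```c
#include <stdint.h>

/* Values copied from the oops in the log. */
#define FAULT_ADDR 0x006c0070u /* "virtual address 006c0070" */
#define REG_ECX    0x006c006cu /* "ecx: 006c006c" */

/* Hypothetical helper: distance between the faulting address and a
 * base register, to compare with the displacement in the instruction. */
static uint32_t fault_offset(uint32_t fault_addr, uint32_t base)
{
    return fault_addr - base;
}

/* Hypothetical helper: models what a guard of the form "if (!ptr)"
 * tests. It returns 1 only when the pointer value is exactly zero. */
static int null_guard_catches(uint32_t ptr)
{
    return ptr == 0;
}
```

With the values from the log, fault_offset(FAULT_ADDR, REG_ECX) is 4, which matches the displacement in the faulting instruction, and null_guard_catches(REG_ECX) is 0. If this reading is right, the pointer being followed was the non-NULL value 006c006c, so a check like if (!page) would probably not have caught it.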