On Wed, Feb 09, 2005 at 01:17:48AM +0100, Christian wrote:
>
> maybe you can elaborate a bit more on the "corrupted journals": what does
> "fsck" say, what's in the kernel log (during mount). if we know the
> symptoms, perhaps someone can find the root of the problem...
>

I'm seeing the same behavior, but after only a few hours under heavy load,
and also with two new Hitachi SATA drives, which show up as sda and sdb.
The system is Fedora Core 3 running 2.6.10-1.770_FC3. I had to use the
"irqpoll" kernel option to keep the machine from locking hard when the
SATA driver loads.

From /var/log/dmesg:

SCSI subsystem initialized
libata version 1.10 loaded.
sata_sil version 0.8
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI interrupt 0000:00:11.0[A] -> GSI 11 (level, low) -> IRQ 11
ata1: SATA max UDMA/100 cmd 0xE083A080 ctl 0xE083A08A bmdma 0xE083A000 irq 11
ata2: SATA max UDMA/100 cmd 0xE083A0C0 ctl 0xE083A0CA bmdma 0xE083A008 irq 11
irq 11: nobody cared (try booting with the "irqpoll" option.
[<c013e0a0>] __report_bad_irq+0x2b/0x68
[<c013e169>] note_interrupt+0x73/0x96
[<c013d6cc>] __do_IRQ+0x1bd/0x249
[<c0104e04>] do_IRQ+0x5e/0x7a
=======================
[<c01035b2>] common_interrupt+0x1a/0x20
[<c0120b50>] __do_softirq+0x2c/0x79
[<c0104edc>] do_softirq+0x38/0x3f
=======================
[<c0104e16>] do_IRQ+0x70/0x7a
[<c01035b2>] common_interrupt+0x1a/0x20
[<c020a182>] acpi_processor_idle+0xf1/0x1f6
[<c010108f>] cpu_idle+0x1f/0x34
[<c03a5665>] start_kernel+0x16b/0x16d
handlers:
[<e08a1be7>] (ata_interrupt+0x0/0x210 [libata])
Disabling IRQ #11
ata1: dev 0 cfg 49:2f00 82:74eb 83:7fea 84:4023 85:74e8 86:3c02 87:4023 88:203f
ata1: dev 0 ATA, max UDMA/100, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: dev 0 cfg 49:2f00 82:74eb 83:7fea 84:4023 85:74e8 86:3c02 87:4023 88:203f
ata2: dev 0 ATA, max UDMA/100, 488397168 sectors: lba48
ata2: dev 0 configured for UDMA/100
scsi1 : sata_sil
Vendor: ATA Model: HDS722525VLSA80 Rev: V36O
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: HDS722525VLSA80 Rev: V36O
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0

In the past 6 hours, I've recorded the following (grepped from dmesg with
-i ext3):

EXT3-fs error (device sdb1): ext3_new_block: Allocating block in system zone - block = 2588673
EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device sdb1) in ext3_prepare_write: Journal has aborted
ext3_abort called.
EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sdb1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with journal data mode.
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #7618561: rec_len % 4 != 0 - offset=0, inode=1179011410, rec_len=58182, name_len=139
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #7618561: rec_len % 4 != 0 - offset=0, inode=1179011410, rec_len=58182, name_len=139
EXT3-fs error (device sda1) in start_transaction: Journal has aborted
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #7618561: rec_len % 4 != 0 - offset=0, inode=1179011410, rec_len=58182, name_len=139
EXT3-fs error (device sda1): ext3_readdir: bad entry in directory #7618561: rec_len % 4 != 0 - offset=0, inode=1179011410, rec_len=58182, name_len=139
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on hdg1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on hdh1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3-fs error (device sda1): ext3_free_blocks_sb: bit already cleared for block 15261701
EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device sda1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device sda1) in ext3_truncate: Journal has aborted
EXT3-fs error (device sda1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device sda1) in ext3_orphan_del: Journal has aborted
EXT3-fs error (device sda1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device sda1) in ext3_delete_inode: Journal has aborted
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on hdg1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on hdh1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3-fs error (device sdb1): ext3_add_entry: bad entry in directory #1982465: rec_len % 4 != 0 - offset=0, inode=1179011410, rec_len=46658, name_len=117
ext3_abort called.
EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
EXT3-fs error (device sdb1) in ext3_create: IO failure
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with journal data mode.
EXT3-fs error (device sdb1): ext3_new_block: Allocating block in system zone - block = 19431424
EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device sdb1) in ext3_prepare_write: Journal has aborted
ext3_abort called.
EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
EXT3-fs error (device sdb1) in start_transaction: Journal has aborted
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Filesystem error recorded from previous mount: IO failure
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sdb1, internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with journal data mode.
EXT3 FS on hdf1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3 FS on hdg1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3 FS on hdh1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Filesystem error recorded from previous mount: error -87241522
EXT3-fs warning (device sdb1): ext3_clear_journal_err: Marking fs in need of filesystem check.
EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
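
A log like the one above is easier to compare across devices once it is
tallied. The following is only a minimal Python sketch, assuming the
messages keep the two shapes seen here ("(device X): func:" and
"(device X) in func:"). The names PATTERN, tally_ext3_messages and SAMPLE
are made up for illustration and are not part of any ext3 tool.

```python
import re
from collections import Counter

# Assumed message shapes, taken from the log excerpt above:
#   EXT3-fs error (device sdb1): ext3_new_block: Allocating block ...
#   EXT3-fs error (device sdb1) in ext3_prepare_write: Journal has aborted
# Lines without a "(device ...)" part are deliberately ignored.
PATTERN = re.compile(
    r"EXT3-fs (?P<level>error|warning) "
    r"\(device (?P<dev>\w+)\)(?::| in) (?P<func>\w+):"
)


def tally_ext3_messages(lines):
    """Count EXT3-fs errors/warnings per (device, level, function)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            key = (match.group("dev"), match.group("level"), match.group("func"))
            counts[key] += 1
    return counts


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "EXT3-fs error (device sdb1): ext3_new_block: Allocating block in system zone - block = 2588673",
    "EXT3-fs error (device sdb1) in ext3_reserve_inode_write: Journal has aborted",
    "EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted",
    "EXT3-fs error (device sda1) in ext3_free_blocks_sb: Journal has aborted",
    "EXT3-fs warning: mounting fs with errors, running e2fsck is recommended",
    "ext3_abort called.",
]

for (dev, level, func), n in sorted(tally_ext3_messages(SAMPLE).items()):
    print(f"{dev} {level:7} {func}: {n}")
```

To run it on a live system, the same function could be fed the lines of
dmesg output (for example by iterating over sys.stdin) instead of SAMPLE.
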

fsck was giving me more output and showing more errors earlier, but now it
is unable to fully repair the FS, and every run just reports block bitmap
differences:

root@servo:~$ fsck -fy /dev/sdb1
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(3966976--3966983) +(55412736--55412739) +55412743 +(55449602--55449603) +(55449606--55449607)
Fix? yes

/dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb1: 343/30539776 files (2.3% non-contiguous), 25691648/61049000 blocks

root@servo:~$ fsck -fy /dev/sdb1
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(3966976--3966983) +(55449600--55449607)
Fix? yes

/dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb1: 343/30539776 files (2.3% non-contiguous), 25691648/61049000 blocks

Any ideas?

On an unrelated note, is the irqpoll option the cause of this oft-repeated
message?

Mar 19 05:38:58 servo kernel: hdc: cdrom_pc_intr: The drive appears confused (ireason = 0x01)

---
Nitin Dahyabhai
<nitind@xxxxxxxxx>

_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users