RAID failure already!!!!!

I just set up my three-disk RAID5 array yesterday (hdb, hdc, hdd). I
created it with these commands:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/hdb /dev/hdc /dev/hdd
mdadm --detail --scan >> /etc/mdadm.conf
mkfs.ext3 -j -O dir_index /dev/md0
mount /dev/md0 /storage

I copied tons of stuff to the RAID array. Overnight a disk already
failed, and I got this in the log:

Nov 16 02:49:36 storage kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Nov 16 02:49:36 storage kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=12244384, high=0, low=12244384, sector=12244384
Nov 16 02:49:36 storage kernel: ide: failed opcode was: unknown
Nov 16 02:49:36 storage kernel: end_request: I/O error, dev hdb, sector 12244384
Nov 16 02:49:36 storage kernel: raid5:md0: read error not correctable (sector 12244384 on hdb).
Nov 16 02:49:36 storage kernel: raid5: Disk failure on hdb, disabling device. Operation continuing on 1 devices
Nov 16 02:49:36 storage kernel: Buffer I/O error on device md0, logical block 3061092
Nov 16 02:49:36 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:36 storage kernel: Buffer I/O error on device md0, logical block 3068016
Nov 16 02:49:36 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:36 storage kernel: Buffer I/O error on device md0, logical block 3068000
Nov 16 02:49:36 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:36 storage kernel: Buffer I/O error on device md0, logical block 3068017
Nov 16 02:49:36 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:36 storage kernel: Buffer I/O error on device md0, logical block 3068001
Nov 16 02:49:39 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3068018
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3068002
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3068019
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3068003
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3068020
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Nov 16 02:49:44 storage kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=12252592, high=0, low=12252592, sector=12252592
Nov 16 02:49:44 storage kernel: ide: failed opcode was: unknown
Nov 16 02:49:44 storage kernel: end_request: I/O error, dev hdb, sector 12252592
Nov 16 02:49:44 storage kernel: raid5:md0: read error not correctable (sector 12252592 on hdb).
Nov 16 02:49:44 storage kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Nov 16 02:49:44 storage kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=12256696, high=0, low=12256696, sector=12256696
Nov 16 02:49:44 storage kernel: ide: failed opcode was: unknown
Nov 16 02:49:44 storage kernel: end_request: I/O error, dev hdb, sector 12256696
Nov 16 02:49:44 storage kernel: raid5:md0: read error not correctable (sector 12256696 on hdb).
Nov 16 02:49:44 storage kernel: printk: 8920 messages suppressed.
Nov 16 02:49:44 storage kernel: Buffer I/O error on device md0, logical block 3064167
Nov 16 02:49:44 storage kernel: lost page write due to I/O error on md0
Nov 16 02:49:44 storage kernel: EXT3-fs error (device md0): read_block_bitmap: Cannot read block bitmap - block_group = 124, block_bitmap = 4063232
Nov 16 02:49:44 storage kernel: Aborting journal on device md0.
Nov 16 02:49:44 storage kernel: ext3_abort called.
Nov 16 02:49:44 storage kernel: EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Nov 16 02:49:44 storage kernel: Remounting filesystem read-only
Nov 16 02:49:44 storage kernel: EXT3-fs error (device md0) in ext3_prepare_write: IO failure
Nov 16 02:49:44 storage kernel: RAID5 conf printout:
Nov 16 02:49:44 storage kernel:  --- rd:3 wd:1 fd:2
Nov 16 02:49:44 storage kernel:  disk 0, o:0, dev:hdb
Nov 16 02:49:44 storage kernel:  disk 1, o:1, dev:hdc
Nov 16 02:49:44 storage kernel: RAID5 conf printout:
Nov 16 02:49:44 storage kernel:  --- rd:3 wd:1 fd:2
Nov 16 02:49:44 storage kernel:  disk 1, o:1, dev:hdc
Nov 16 02:49:44 storage kernel: __journal_remove_journal_head: freeing b_frozen_data
Nov 16 02:50:19 storage kernel: printk: 111 messages suppressed.
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 1
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 2981890
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 3007502
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 3047424
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 3065192
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 3066217
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 02:50:19 storage kernel: Buffer I/O error on device md0, logical block 3067242
Nov 16 02:50:19 storage kernel: lost page write due to I/O error on md0
Nov 16 03:16:25 storage smartd[1618]: Device: /dev/hdb, 3 Currently unreadable (pending) sectors
Nov 16 03:16:25 storage smartd[1618]: Sending warning via mail to root ...
Nov 16 03:16:25 storage smartd[1618]: Warning via mail to root: successful
Nov 16 03:46:30 storage smartd[1618]: Device: /dev/hdb, not capable of SMART self-check
Nov 16 03:46:30 storage kernel: hdb: irq timeout: status=0xd0 { Busy }
Nov 16 03:46:30 storage kernel: ide: failed opcode was: 0xb0
Nov 16 03:46:35 storage kernel: hda: status timeout: status=0xd0 { Busy }
Nov 16 03:46:35 storage kernel: ide: failed opcode was: unknown
Nov 16 03:46:35 storage kernel: hda: DMA disabled
Nov 16 03:46:35 storage kernel: hdb: DMA disabled
Nov 16 03:46:35 storage kernel: hda: drive not ready for command
Nov 16 03:46:36 storage kernel: ide0: reset: success
Nov 16 03:46:36 storage smartd[1618]: Sending warning via mail to root ...
Nov 16 03:46:36 storage smartd[1618]: Warning via mail to root: successful
Nov 16 03:46:46 storage smartd[1618]: Device: /dev/hdb, 3 Currently unreadable (pending) sectors
Nov 16 04:05:04 storage kernel: EXT3-fs error (device md0): ext3_get_inode_loc: unable to read inode block - inode=13123606, block=26247170
Nov 16 04:05:04 storage kernel: EXT3-fs error (device md0): ext3_get_inode_loc: unable to read inode block - inode=13123603, block=26247170
Nov 16 04:05:04 storage kernel: printk: 10 messages suppressed.
Nov 16 04:05:04 storage kernel: Buffer I/O error on device md0, logical block 1568
Nov 16 04:05:04 storage kernel: Buffer I/O error on device md0, logical block 1569
Nov 16 04:05:04 storage kernel: Buffer I/O error on device md0, logical block 1570
Nov 16 04:05:04 storage kernel: Buffer I/O error on device md0, logical block 1571
Nov 16 04:06:13 storage init: Trying to re-exec init
Nov 16 04:16:25 storage smartd[1618]: Device: /dev/hdb, 3 Currently unreadable (pending) sectors
Nov 16 04:46:31 storage smartd[1618]: Device: /dev/hdb, 3 Currently unreadable (pending) sectors
Nov 16 05:16:21 storage smartd[1618]: Device: /dev/hdb, 81 Currently unreadable (pending) sectors
Nov 16 05:46:22 storage smartd[1618]: Device: /dev/hdb, 230 Currently unreadable (pending) sectors
Nov 16 06:16:21 storage smartd[1618]: Device: /dev/hdb, 396 Currently unreadable (pending) sectors
Nov 16 06:46:20 storage smartd[1618]: Device: /dev/hdb, 708 Currently unreadable (pending) sectors
Nov 16 07:16:24 storage smartd[1618]: Device: /dev/hdb, 850 Currently unreadable (pending) sectors
Nov 16 07:46:36 storage smartd[1618]: Device: /dev/hdb, 1001 Currently unreadable (pending) sectors

Why isn't the array still up and running on hdc and hdd? As a matter
of fact, I don't see hdd anywhere in the log snippet above. Am I hosed
already, or is there any way to recover from this? Unfortunately, I
don't have a backup for some of the data.
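
One way to read the "RAID5 conf printout" lines in the log is sketched below. It is only a rough illustration and rests on an assumption that is not stated anywhere in this post: that rd, wd and fd stand for the configured, working and failed device counts. The sample line is copied from the log above, and the variable names are made up for the example. Since RAID5 can lose at most one member, the check is simply whether wd is at least rd - 1.

```shell
#!/bin/sh
# Sample line copied verbatim from the kernel log in this post.
line=' --- rd:3 wd:1 fd:2'

# Pull out the three counters.
# ASSUMPTION: rd = configured RAID devices, wd = working devices,
# fd = failed devices. The post itself does not define these fields.
rd=$(printf '%s\n' "$line" | sed -n 's/.*rd:\([0-9]*\).*/\1/p')
wd=$(printf '%s\n' "$line" | sed -n 's/.*wd:\([0-9]*\).*/\1/p')
fd=$(printf '%s\n' "$line" | sed -n 's/.*fd:\([0-9]*\).*/\1/p')

# RAID5 tolerates the loss of at most one member device.
min_needed=$((rd - 1))

if [ "$wd" -ge "$min_needed" ]; then
    status="array can run"
else
    status="array cannot run"
fi

echo "rd=$rd wd=$wd fd=$fd: $status (needs $min_needed working)"
```

If that reading is right, the printout says two of the three members were counted as failed at that point, which would fit with hdd never appearing as an active disk in the log.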

Thanks,
James

-- 
fedora-list mailing list
fedora-list@xxxxxxxxxx
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list