Irritating RAID problem (kept spare, kicked data disks due to timestamp)

I found an interesting problem with software RAID 5 in 2.6.10:

I have a RAID 5 array, recently created with mdadm. It consists of four 160 GB drives plus a spare. All four drives were active and fully synced when the box locked up due to some sort of hardware problem. When I rebooted, the kernel refused to start the array because all four drives had an older timestamp than the spare. So the RAID code kicked them out, one after another, until it was left with just the single spare disk. Since it can't start an array with 0/4 disks, it failed. I was able to repeat this with 2.6.10 and 2.6.2 (the only other kernel I had handy). Pulling the spare disk and rebooting fixed everything.
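For reference, this is roughly how to see what the kernel is judging by (the device names below are only placeholders for my setup). mdadm --examine prints the per-disk superblock fields, and listing only the data disks on the assemble line is the command-line equivalent of physically pulling the spare:

  # Compare the superblock stamps on a data disk vs. the spare
  mdadm --examine /dev/sdb1 | egrep 'Update Time|Events'
  mdadm --examine /dev/sdf1 | egrep 'Update Time|Events'   # the spare

  # Assemble from the four data disks only, leaving the spare out
  mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1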

I don't have a record of the logs from this period--the box was in single-user mode with disk problems, and I didn't want to write anything to the disk.

Logically, it seems like the kernel's RAID recovery code shouldn't just look for the newest disk; it should look for a quorum, even if that means kicking out disks with newer timestamps. *Especially* when the newest timestamp belongs to the spare disk.
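Just to illustrate the idea (a rough sketch from userspace, not kernel code--the device names and the event-counter parsing are assumptions about my setup): compare the event counters across all members and treat whatever value the majority of disks report as the real state of the array:

  #!/bin/sh
  # Rough sketch: list each member's superblock event counter, then show
  # which value the majority of members agree on (the "quorum" state).
  for d in /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1; do
      ev=$(mdadm --examine "$d" | awk '/Events :/ {print $3}')
      echo "$ev $d"
  done | sort > /tmp/md-events

  # The most common event count is the quorum; the odd one out is suspect
  cut -d' ' -f1 /tmp/md-events | uniq -c | sort -rn | head -1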


Scott

