On Sep 15, 2009, at 3:22 PM, Matthias Urlichs wrote:
I had a somewhat strange error today. One of my servers has a RAID1 array. Two partitions at the end of the disk; the RAID superblocks are at the end of the partition. After a hard reboot today, one of the disks managed to not have its partition table scanned correctly, most probably because the disk was hung and the ("intelligent") controller got confused about it. After the initial scan, however, it came up correctly. This error caused mdadm to "successfully" build a RAID1 from /dev/sda3 and /dev/sdb (instead of /dev/sdb3). Needless to say, the resulting volume was somewhat unusable. To say the least. My server's mdadm.conf has a 'DEVICE=partitions' line. I suppose that replacing this with a pattern that explicitly only matches partitions, not disks, would make the problem go away, and that the lesson from today's disaster recovery effort is to always explicitly list the allowed partition names, instead of being lazy and using 'DEVICE=partitions'.
Wrong lesson. The correct lesson to gather from this is to prefer version 1.1 or 1.2 superblocks wherever possible. Superblocks at the beginning of the device disappear when there is no partition table, whereas superblocks at the end can be mistaken for superblocks belonging to the whole device when there is no partition table.
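To make the difference concrete, here is a rough sketch of the arithmetic in Python. It assumes version 0.90 metadata (the thread does not say which end-of-device format was in use), which is stored in the last 64 KiB-aligned 64 KiB block of the device, and it compares that with version 1.2, which sits 4 KiB from the start. The disk and partition sizes are invented for illustration, and the partition is assumed to start on a 64 KiB boundary and run to the very end of the disk; real layouts may not line up this neatly.

```python
# Sketch only: all sizes are hypothetical, in 512-byte sectors.

MD_RESERVED_SECTORS = 128  # 64 KiB expressed in 512-byte sectors
SB_1_2_OFFSET = 8          # version 1.2 superblock: 4 KiB from the device start


def sb_offset_090(device_sectors: int) -> int:
    """Sector offset of a 0.90 superblock, relative to the device start."""
    return (device_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS


# Assumed layout: the last partition ends exactly where the disk ends.
disk_sectors = 1_000_000 * 128          # whole disk, e.g. /dev/sdb
part_start = 600_000 * 128              # start of the last partition, e.g. /dev/sdb3
part_sectors = disk_sectors - part_start

# Absolute on-disk sector where each view of the device looks for 0.90 metadata.
end_via_partition = part_start + sb_offset_090(part_sectors)
end_via_whole_disk = sb_offset_090(disk_sectors)

# Both views land on the same sector, so a scan of the bare disk
# finds the partition's superblock and takes it for its own.
end_locations_match = end_via_partition == end_via_whole_disk

# Absolute on-disk sector where each view looks for 1.2 metadata.
start_via_partition = part_start + SB_1_2_OFFSET
start_via_whole_disk = SB_1_2_OFFSET

# These differ, so a scan of the bare disk finds nothing to assemble.
start_locations_match = start_via_partition == start_via_whole_disk

print("0.90 locations match:", end_locations_match)
print("1.2 locations match:", start_locations_match)
```

With start-of-device metadata, a disk whose partition table was not read simply shows no superblock, so the array comes up degraded instead of being assembled from the wrong device.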
-- Doug Ledford <dledford@xxxxxxxxxx> GPG KeyID: CFBFF194 http://people.redhat.com/dledford InfiniBand Specific RPMS http://people.redhat.com/dledford/Infiniband