On 01/04/2014 05:05 PM, Fabian Knorr wrote:
> Hi, Phil,
>
> thank you very much for your reply.
>
>> Side note: If you have a live spare available for a raid5, there's no
>> good reason not to reshape to a raid6, and very good reasons to do so.
>
> I was worried that RAID6 would incur a significant load on the CPU,
> especially if one disk fails. The system is a single-core Intel Atom.

It does add more load, especially when degraded. I guess it depends on
your usage pattern. I would try it before I gave up on the idea.

>> Device names are not guaranteed to remain identical from one boot to
>> another, and often won't be if a removable device is plugged in at
>> that time. The Linux MD driver keeps identity data in the superblock
>> that makes the actual device names immaterial.
>>
>> It is really important that we get a "map" of device names to drive
>> serial numbers, and adjust all future operations to ensure we are
>> working with the correct names. An excerpt from "ls -l
>> /dev/disk/by-id/" would do. And you need to verify it after every
>> boot until this crisis is resolved.
>
> See the attachment "partitions". I grep'ed for raid partitions.
>
>> 1) raid.status appears to be from *after* your --add attempts. That
>> means anything in those reports from those devices is useless. So we
>> will have to figure out what that data was.
>
> Could it be that --add only changed the superblock of one disk,
> namely /dev/sdb in the file from my first e-mail?

/dev/sda, actually.

>> 2) You attempted to recreate the array. If you left out
>> "--assume-clean", your data is toast. Please show the precise command
>> line you used in your re-create attempt. Also generate a fresh
>> "raid.status" for the current situation.
>
> The only commands I used were --add /dev/sdb, --run, --assemble --scan,
> --assemble --scan --force and --stop. I didn't try to re-create it, at
> least not now. Also, the timestamp from raid.status (2011) is
> incorrect; the array was re-created from scratch in the summer of
> 2012. I can't tell why disks other than /dev/sdb1 have an invalid
> superblock.

This is very good news. In fact, I think --assemble --force can still
be made to work....

>> 3) The array seems to think its member devices were /dev/sda through
>> /dev/sdh (not in that order). Your "raid.status" has /dev/sd[abcefghi],
>> suggesting a rescue usb or some such is /dev/sdd.
>
> Yes, that's correct.

Very good.

>> 4) Please describe the structure of the *content* of the array, so we
>> can suggest strategies to *safely* recognize when our future attempts
>> to --create --assume-clean have succeeded. LVM? Partitioned? One big
>> filesystem?
>
> I'm using the array as a physical volume for LVM.

Ok. Try this:

	mdadm --stop /dev/md0
	mdadm -Afv /dev/md0 /dev/sd[bcefghi]1

It leaves out /dev/sda, which appears to have been the spare in the
original setup.

If MD is happy after that, use "fsck -n" on your logical volumes to
verify your FS integrity, and/or see the extent of the damage (little
or none, I think).

If that works, you can --add /dev/sda1 again, and it will become the
spare again.

If it doesn't work, show everything printed by "mdadm -Afv" above.

HTH,

Phil
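
[Editor's note: a minimal shell sketch of the device-name-to-serial
mapping that Phil asks to be re-verified after every boot. The by-id
link prefixes (ata-*, scsi-*, wwn-*) vary by controller, so treat the
filtering pattern as an assumption to adapt, not part of the original
advice:]

	# Resolve each persistent /dev/disk/by-id/ symlink to its current
	# kernel name, then keep only whole disks (/dev/sdX). Re-run after
	# every boot before issuing any mdadm command.
	for link in /dev/disk/by-id/*; do
	    printf '%-12s %s\n' "$(readlink -f "$link")" "${link##*/}"
	done | sort | grep -E '^/dev/sd[a-z] '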