Neil Brown <neilb@xxxxxxx> writes:

> On Saturday August 26, wferi@xxxxxxx wrote:
>
>> After an intermittent network failure, our RAID6 array of AoE devices
>> can't run anymore. It looks like the system dropped the disks one
>> after the other, and at the third the array failed, as expected.
>> Trying to assemble the array results in all the disks going into
>> spare status, nothing useful. The disks really must have been cut off
>> simultaneously, but their superblocks have probably been altered
>> since then by the recovery attempts.
>
> You say some of the drives are 'spare'. How did that happen? Did you
> try to add them back to the array after it had failed? That is a
> mistake.

It surely was, though not mine.

> The thing to do at that point is:
> - stop the array
> - make sure the network is back and the individual drives are
>   working
> - use mdadm to assemble with --force. This should 'just work'.

It probably would have...

> But if you used --add, then you will have destroyed info in the
> superblock. That isn't the end of the world, but it makes things a
> little harder.
>
> The easiest thing to do is simply to recreate the array, making sure
> the drives are in the correct order and the options (like chunk size)
> are the same. This will not hurt the data (if done correctly).

Thanks, that did it! Strangely (to me), mdadm -E doesn't report the
chunk size; only mdadm -D does, and that isn't available prior to
assembly. It looks like it was left at the default of 64k. I recreated
the array with two drives missing to avoid triggering a resync, and
added them afterwards; I wonder whether that makes any difference.
Anyway, thanks a lot!
--
Feri.
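
P.S. For the archives, the forced assembly Neil describes would have
looked roughly like this. The device names below are made up: with AoE
the members show up under /dev/etherd/, so adjust the shelf/slot
numbers and the array name to match your own setup.

  mdadm --stop /dev/md0
  # confirm each member answers again before going on, e.g.:
  mdadm -E /dev/etherd/e0.0
  mdadm --assemble --force /dev/md0 /dev/etherd/e0.[0-4]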
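
And the recreate route we ended up on, again with hypothetical names
and assuming a five-disk array. The member order and chunk size must
match the original, and two slots are left as "missing" so that no
resync starts; the dropped drives are then re-added and rebuilt onto:

  mdadm --create /dev/md0 --level=6 --raid-devices=5 --chunk=64 \
        /dev/etherd/e0.0 /dev/etherd/e0.1 /dev/etherd/e0.2 missing missing
  mdadm /dev/md0 --add /dev/etherd/e0.3
  mdadm /dev/md0 --add /dev/etherd/e0.4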