On Fri, Dec 23 2016, Giuseppe Bilotta wrote:

> Hello again,
>
> On Thu, Dec 8, 2016 at 8:02 PM, John Stoffel <john@xxxxxxxxxxx> wrote:
>>
>> Sorry for not getting back to you sooner, I've been under the weather
>> lately. And I'm NOT an expert on this, but it's good you've made
>> copies of the disks.
>
> Don't worry about the timing, as you can see I haven't had much time
> to dedicate to the recovery of this RAID either. As you can see, it
> was not that urgent ;-)
>
>
>> Giuseppe> Here it is. Notice that this is the result of -E _after_ the attempted
>> Giuseppe> re-add while the RAID was running, which marked all the disks as
>> Giuseppe> spares:
>>
>> Yeah, this is probably a bad state. I would suggest you try to just
>> assemble the disks in various orders using your clones:
>>
>> mdadm -A /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf
>>
>> And then mix up the order until you get a working array. You might
>> also want to try assembling using the 'missing' flag for the original
>> disk which dropped out of the array, so that just the three good disks
>> are used. This might take a while to test all the possible
>> permutations.
>>
>> You might also want to look back in the archives of this mailing
>> list. Phil Turmel has some great advice and howto guides for this.
>> You can do the test assembles using loop back devices so that you
>> don't write to the originals, or even to the clones.
>
> I've used the instructions on using overlays with dmsetup + sparse
> files on the RAID wiki
> https://raid.wiki.kernel.org/index.php/Recovering_a_damaged_RAID
> to experiment with the recovery (and just to be sure, I set the
> original disks read-only using blockdev; might be worth adding this to
> the wiki).
>
> I also wrote a small script to test all combinations (nothing smart,
> really, simply enumeration of combos, but I'll consider putting it up
> on the wiki as well), and I was actually surprised by the results. To
> test if the RAID was being re-created correctly with each combination,
> I used `file -s` on the RAID, and verified that the results made
> sense.
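
The script itself isn't shown in the thread, but for anyone following
along, a rough sketch of that overlay-plus-test-create approach might
look like the following. The device names are the ones used above; the
overlay size, the snapshot chunk size, and all of the mdadm --create
parameters are assumptions, not values taken from this thread, and the
create parameters in particular would have to match the real array
exactly:

    # Keep the real disks strictly read-only before doing anything else:
    for dev in sdd sde sdf sdg; do
        blockdev --setro /dev/$dev
    done

    # Put a copy-on-write overlay (sparse file + loop device + dm
    # snapshot) on top of each disk, so test assemblies never write
    # to the disk itself:
    for dev in sdd sde sdf sdg; do
        truncate -s 4G /tmp/overlay-$dev                # sparse COW file
        loop=$(losetup -f --show /tmp/overlay-$dev)     # back it with a loop device
        size=$(blockdev --getsz /dev/$dev)              # size in 512-byte sectors
        dmsetup create ovl-$dev --table "0 $size snapshot /dev/$dev $loop P 8"
    done

    # One candidate ordering can then be re-created on the overlays and
    # sanity-checked with file -s.  Chunk size, metadata version and data
    # offset must match the original array, so treat these as placeholders:
    mdadm --create /dev/md111 --assume-clean --run --level=6 --raid-devices=4 \
          /dev/mapper/ovl-sdg /dev/mapper/ovl-sdf /dev/mapper/ovl-sde /dev/mapper/ovl-sdd
    file -s /dev/md111
    mdadm --stop /dev/md111
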
> I am surprised to find out that there are multiple combinations
> that make sense (note that the disk names are shifted by one compared
> to previous emails due to a machine lockup that required a reboot and
> another disk butting in to a different order):
>
> trying /dev/sdd /dev/sdf /dev/sde /dev/sdg
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
>
> trying /dev/sdd /dev/sdf /dev/sdg /dev/sde
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
>
> trying /dev/sde /dev/sdf /dev/sdd /dev/sdg
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
>
> trying /dev/sde /dev/sdf /dev/sdg /dev/sdd
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
>
> trying /dev/sdg /dev/sdf /dev/sde /dev/sdd
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
>
> trying /dev/sdg /dev/sdf /dev/sdd /dev/sde
> /dev/md111: Linux rev 1.0 ext4 filesystem data,
> UUID=0031565c-38dd-4445-a707-f77aef1cbf7e, volume name "oneforall"
> (needs journal recovery) (extents) (large files) (huge files)
> :
>
> So there are six out of 24 combinations that make sense, at least for
> the first block. I know from the pre-fail dmesg that the g-f-e-d order
> should be the correct one, but now I'm left wondering if there is a
> better way to verify this (other than manually sampling files to see
> if they make sense), or if the left-symmetric layout on a RAID6 simply
> allows some of the disk positions to be swapped without loss of data.

Your script has reported all arrangements with /dev/sdf as the second
device. Presumably that is where the single block you are reading
resides.

To check whether a RAID6 arrangement is credible, you can try the
raid6check program that is included in the mdadm source release. There
is a man page. If the order of devices is not correct, raid6check will
tell you about it.

NeilBrown
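
For reference, a rough sketch of building and running raid6check
against one candidate ordering follows. The git URL, the build step and
the argument layout (device, first stripe, number of stripes) are
recalled from the mdadm source tree and its man page rather than quoted
from this thread, so verify them against the man page before relying on
them:

    # Build raid6check from the mdadm sources (it is typically not built
    # by the default make target, and not always packaged):
    git clone https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git
    cd mdadm
    make raid6check

    # With one candidate ordering assembled (on the overlays, not the
    # clones), check parity consistency over the first stripes.  A
    # component device that is consistently reported as inconsistent
    # suggests the ordering is wrong:
    ./raid6check /dev/md111 0 1000
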