On Sun, Jan 23, 2011 at 2:44 PM, Arno Wagner <arno@xxxxxxxxxxx> wrote:
> On Sat, Jan 22, 2011 at 07:41:18PM +0100, Stygge wrote:
>> On Fri, 21 Jan 2011 at 13:46, Arno Wagner wrote:
>> > : :
>> # mdadm -D /dev/md0
>> /dev/md0:
>> Version : 0.90
>> Creation Time : Wed Jan 19 22:16:47 2011
>
> That does look bad, unless you really created the original array
> on Wednesday last week. Looks more like the array was re-created
> from the original disks, but not the original superblock. *ouch*
> I suspect this could lead to a different parity disk or different
> disk order in the array. It _is_ possible that I am reading this
> wrong, and the creation time gets updated on whatever recovery
> was done.
>
> Question is how did the array recover? It clearly did not do
> so by itself. Which distribution is this?

No, it didn't reassemble itself until last boot - I first did a
manual stop and assemble, just to see if it was some sort of "burp"
in the driver - The distro is CentOS 5.5 (Final),
kernel 2.6.18-194.26.1.el5.centos.plus, x86_64

>> # cat /sys/block/md0/md/mismatch_cnt
>> 72
>
> Wups, 72 mismatches on an array that was last synced a few
> days ago? Maybe this is actually one or more dying disks.
> Anything other than a zero result is bad.

Uh-oh - now I'm getting really worried!

>
>> > If the RAID did not assemble again properly,
>> > manual intervention and assembly may be necessary
>> > in order to unlock and save the data.
>>
>> mdadm --assemble --scan works fine and sets up the raid just fine.
>
> Yes. But after 2 disks were kicked from a RAID5 it should
> not do that, unless you force it to. And if you force it,
> and it guesses wrong about which disk was kicked first,
> it will mix the state from when the first disk was kicked with
> the state from when the second disk was kicked. And overwrite
> the disk kicked second in the process.
>
> As far as I can see, this would result in a mixed disk state.
> Everything written between the first and second kick would be
> corrupt. However the key-slot would not be unless changed in
> between. It might just have a wrong disk order. Again, if it
> did assemble from the existing superblocks, the order will be
> right.
>
>> I *really* hope that I'm not completely scr*w*d :-(
>
> Impossible to tell at this time.
>
> Ok, next steps:
>
> 1. Post or send me a long SMART status for each disk
>    (smartctl -a /dev/<disk>), these 72 inconsistencies
>    are not good and need to be looked at.
>
> 2. Can you give me the header backup (a bit more than 1MB)?
>    I cannot break your security that way, but looking at the
>    borders of the header and keyslot may tell me whether
>    the disk order was mixed up. This may allow reshuffling
>    of the disks to the correct order. (If that is the problem.)
>
>    The procedure would be to produce one or more reshuffled
>    headers and give them back to you to see whether any allows
>    an unlock. If so, the next step would be to reshuffle the whole
>    array (which would still be inconsistent).
>
> 3. You could also give me the first 260kB of each disk, then
>    I can check whether any of the 72 inconsistencies is in the
>    key-slot area. Again, this does not allow me to break your
>    security. And again, this would allow creating an alternate
>    header that could work and allow you to unlock.

I'll send these files off-list.

> These are my best guesses. You can do all that yourself as
> well, refer to the FAQ for details of the on-disk LUKS
> structure, and Wikipedia for RAID5 if you plan to. There
> should also be information on how to interpret SMART data on
> the web, although I have some long-term experience in that
> area.

I'll take you up on that offer as I'm quite the newbie with this
sort of thing :-)

/S
_______________________________________________
dm-crypt mailing list
dm-crypt@xxxxxxxx
http://www.saout.de/mailman/listinfo/dm-crypt
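
For anyone following steps 1 and 3 above on their own, here is a minimal sketch of how the data could be collected. It is not something posted in the thread: the helper name dump_head, the output file names, and the /dev/sdX1-style device names are all hypothetical placeholders, and "260kB" is assumed here to mean 260 * 1024 bytes. The copy only reads from the disks, so it should not change the array, but it is an illustration rather than a tested recovery procedure.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): copy the first N KiB of a
# device or file into an output file. N defaults to 260, on the
# assumption that "260kB" means 260 * 1024 bytes.
dump_head() {
    src=$1
    out=$2
    kib=${3:-260}
    # Read-only on the source: dd only opens "$src" for reading.
    dd if="$src" of="$out" bs=1024 count="$kib" 2>/dev/null
}

# Example use as root. Device names are placeholders; substitute the
# real RAID member devices.
#
#   for d in /dev/sdX1 /dev/sdY1 /dev/sdZ1; do
#       name=$(basename "$d")
#       smartctl -a "$d" > "smart-$name.txt"   # step 1: SMART report
#       dump_head "$d" "head-$name.img"        # step 3: first 260 KiB
#   done
```

The function takes plain paths, so it can be tried on an ordinary file first before pointing it at a real disk.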