Re: out of sync raid 5 + xfs = kernel startup problem

Neil Brown wrote:

On Tuesday April 12, robey@xxxxxxxxxxxxxxxxxxx wrote:


My raid5 system recently went through a sequence of power outages. When everything came back on, the drives were out of sync. No big issue... just sync them back up again. But something is going wrong. Any help is appreciated. dmesg provides the following (the network stuff is mixed in):




Here's the main problem.

You've got a degraded, unclean array. i.e. one drive is
failed/missing and md isn't confident that all the parity blocks are
correct due to an unclean shutdown (could have been in the middle of a
write). This means you could have undetectable data corruption.


md wants you to know this and not assume that everything is perfectly
OK.

You can still start the array, but you will need to use
 mdadm --assemble --force
which means you need to boot first ... got a boot CD?

I should add a "raid=force-start" or similar boot option, but I
haven't yet.

So, boot somehow, and
 mdadm --assemble /dev/md0 --force /dev/sd[a-f]2

 mdadm /dev/md0 -a /dev/sdd2

wait for sync to complete (not absolutely needed).

Reboot.
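
For the "wait for sync to complete" step, progress can be read from /proc/mdstat. The following is only a rough sketch, not something from the original reply: the function names sync_progress and wait_for_sync are made up for illustration, and it assumes the kernel reports progress on a line of the form "recovery = NN.N%" or "resync = NN.N%".

```shell
# Hypothetical helper: read mdstat-formatted text on stdin and print the
# percentage of any running resync/recovery. Prints nothing if no sync
# line is found. The line format matched here is an assumption.
sync_progress() {
    sed -n -E 's/.*(resync|recovery) *= *([0-9.]+%).*/\2/p'
}

# Hypothetical helper: poll /proc/mdstat once a minute and return when
# no resync/recovery line is reported any more.
wait_for_sync() {
    while pct=$(sync_progress < /proc/mdstat) && [ -n "$pct" ]; do
        echo "sync at $pct"
        sleep 60
    done
}
```

Running wait_for_sync from the rescue environment before the reboot would block until md stops reporting a recovery. It does not tell you whether the sync finished or was aborted by a drive failure, so dmesg is still worth checking afterwards.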


Thanks for the help. I rebooted using a rescue partition and ran the two commands. After about 2 hours of syncing, the array decided that sdf had failed and stopped the sync. I restarted and then tried to assemble the array once again. sdd2 and sdf2 are now both marked as spares and the array has only 4/6 partitions... dead. Can I force the device numbers within the array? I know that sdd2 was position 5 and sdf2 was position 3. I'd like to save what I can... most of the data on the array can be reproduced... but it takes so much time.
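
Before trying to force anything, it may help to see what each member's superblock currently records. The sketch below is an illustration only, not advice from the thread: parse_events and show_events are hypothetical names, and it assumes that "mdadm --examine" prints a line beginning with "Events :" for each member. It only reads the superblocks and changes nothing on disk.

```shell
# Hypothetical helper: pull the event counter out of "mdadm --examine"
# output read from stdin (assumes a line of the form "Events : <value>").
parse_events() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Hypothetical helper: print "<device> <events>" for every member given
# as an argument, or "unknown" if no superblock could be read.
show_events() {
    for dev in "$@"; do
        ev=$(mdadm --examine "$dev" 2>/dev/null | parse_events)
        echo "$dev ${ev:-unknown}"
    done
}
```

For example, show_events /dev/sd[a-f]2 would list all six partitions side by side. Members whose counters lag behind the rest are the ones md considers stale. The full --examine output also shows the role each superblock records for its device.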

If anyone is interested: during my attempts to force the array to run, I got a segfault in mdadm. I'll post a snippet here... ignore it if it's old news.

md: pers->run() failed ...
Unable to handle kernel NULL pointer dereference at 0000000000000030 RIP:
<ffffffff802c9350>{md_error+64}

Again... thanks for any and all help.

-Robey