----- Message from neilb@xxxxxxx ---------
    Date: Mon, 15 Oct 2007 09:31:23 +1000
    From: Neil Brown <neilb@xxxxxxx>
Reply-To: Neil Brown <neilb@xxxxxxx>
 Subject: Re: Help RAID5 reshape Oops / backup-file
      To: Nagilum <nagilum@xxxxxxxxxxx>
      Cc: linux-raid@xxxxxxxxxxxxxxx
On Sunday October 14, nagilum@xxxxxxxxxxx wrote:
> Can someone tell me if I'm on the right track?
> I've now noticed the following:
>
> # ~/mdadm-2.6.3/mdadm -v -A /dev/md0 /dev/sd[d-e]
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot -1.
> mdadm: No suitable drives found for /dev/md0

Hmm... that might be useful..

I just found your earlier email where you said:

> After the machine came back up (on a rescue disk) I thought I'd simply have to go through the process again. So I use add add the new disk again. Although that worked, I am now unable to resume the growing process.

Using "add add" again was not correct, and should not have been possible.
You should have simply assembled the array with the full new set of devices. Then reshape would have automatically restarted properly.

Can you remember *exactly* what you did? If I can reproduce the situation, I can find the best way to fix it and send you something to try.

NeilBrown
----- End message from neilb@xxxxxxx -----

Sure, here it goes:

The system is running Debian Etch ia64, kernel 2.6.18. Since the exact versions might be important in this case, I made copies of what I deemed relevant available online. A copy of the "linux/drivers/md" folder of that particular kernel can be found at:
http://www.nagilum.de/md/md

Etch comes with mdadm-2.5.6 + Debian patches. See http://www.nagilum.de/md/mdadm-2.5.6/debian/changelog
I made the whole Debian package available here: http://www.nagilum.de/md/
- "mdadm-2.5.6" the extracted source with Debian patches applied
- "mdadm_2.5.6-9.diff.gz" the diff to mdadm_2.5.6.orig.tar.gz
- "mdadm_2.5.6-9_i386.deb" the i386 version of the package; however, I was/am using mdadm_2.5.6-9_ia64.deb
- "mdadm_2.5.6-9.dsc" description file for building the .deb

The RAID was being reshaped from three to five drives when the shutdown was issued. I assume the shutdown went normally since the machine was off and there was no power interruption.
Upon booting the system it became apparent that the RAID was non-functional.

The system boots off a USB stick and then mounts its root filesystem from the RAID. Assembling the RAID happens within the initrd. The relevant scripts can be found here: http://www.nagilum.de/md/local-top/
I booted a rescue disk which is based on the identical Linux version.

I looked at the "mdadm -Q --detail /dev/md0" output and saw only 3 of the 5 disks in the RAID. Then I did what I should not have done: I added the two new disks, assuming that mdadm would not touch them in a harmful way (without using --force) and would refuse to do so if that was not the way to add an active disk.
The disks were added but the reshape did not continue.

Up until now I can't think of anything else I did that could have changed something (and "mdadm -Q --detail /dev/md0" has looked the same ever since).

I think what I should have done, instead of adding those disks, would have been to use --re-add and/or update /etc/mdadm/mdadm.conf. But then again, I never expected this to become so problematic. :(

By now I can also boot with 2.6.23 (I'll update to 2.6.23.1 shortly) and I have the latest mdadm tools (in parallel to the old ones). I also built the test_stripe utility and very briefly tried the "test" argument, but it wanted me to specify an existing file so I chickened out. ;)
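In case it helps when comparing runs, here is a minimal sketch of how the "identified as a member" lines from the `mdadm -v -A` output quoted above could be tabulated per device. The function names and the regular expression are illustrative assumptions, not part of mdadm, and the pattern was written only against the output lines shown in this thread; other mdadm versions may word them differently. It only flags devices reported with a negative slot, which is what /dev/sdd and /dev/sde show here.

```python
import re

# Matches lines such as (format taken from the output quoted in this thread):
#   mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
# The pattern is an assumption based on those lines only.
MEMBER_RE = re.compile(
    r"^mdadm: (?P<device>\S+) is identified as a member of "
    r"(?P<array>\S+), slot (?P<slot>-?\d+)\.?$"
)


def parse_member_slots(output):
    """Return {device: slot} for every 'identified as a member' line."""
    slots = {}
    for line in output.splitlines():
        match = MEMBER_RE.match(line.strip())
        if match:
            slots[match.group("device")] = int(match.group("slot"))
    return slots


def devices_without_slot(slots):
    """Return the devices that were reported with a negative slot number."""
    return sorted(dev for dev, slot in slots.items() if slot < 0)


# Output copied from the assemble attempt quoted earlier in this message.
SAMPLE = """\
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot -1.
mdadm: No suitable drives found for /dev/md0
"""

if __name__ == "__main__":
    slots = parse_member_slots(SAMPLE)
    print(slots)
    print(devices_without_slot(slots))
```

Feeding it the captured output of an assemble attempt with all five devices would show at a glance which disks still hold a real slot and which do not.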
Thanks a lot for looking into this!
Alex.

========================================================================
# _ __ _ __ http://www.nagilum.org/ \n icq://69646724 #
# / |/ /__ ____ _(_) /_ ____ _ nagilum@xxxxxxxxxxx \n +491776461165 #
# / / _ `/ _ `/ / / // / ' \ Amiga (68k/PPC): AOS/NetBSD/Linux #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/ Mac (PPC): MacOS-X / NetBSD /Linux #
# /___/ x86: FreeBSD/Linux/Solaris/Win2k ARM9: EPOC EV6 #
========================================================================

----------------------------------------------------------------
cakebox.homeunix.net - all the machine one needs..