Ok, since my previous thread didn't seem to attract much attention, let me try again.

An interrupted RAID5 reshape will cause the md device in question to contain one corrupt chunk per stripe if resumed in the wrong manner.

A testcase can be found at http://www.nagilum.de/md/ . The first testcase can be initialized with "start.sh"; the real test can then be run with "test.sh". The first testcase also uses dm-crypt and xfs to show the corruption. The second testcase uses nothing but mdadm and "testpat", a small program that writes and verifies a simple test pattern designed to find block data corruption. Use "v2_start.sh && v2_test.sh" to run it. At the end it will point out all the wrong bytes on the md device.

I'm not just interested in a simple behaviour fix; I'm also interested in what actually happens and, if possible, a repair program for that kind of data corruption.

The bug is architecture-agnostic. I first came across it using 2.6.23.8 on amd64, but I verified it on 2.6.23.[8-12] and 2.6.24-rc[5,6] on ppc, always using mdadm 2.6.4.
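For anyone who doesn't want to fetch the scripts first: the idea behind a pattern checker like "testpat" can be sketched roughly as below. This is NOT the actual testpat source, just a minimal illustration in Python. The block size of 512 bytes, the pattern layout (each block filled with its own block number) and all function names are assumptions made for the example.

```python
import io
import struct

BLOCK_SIZE = 512  # assumed block size; the real testpat may use something else


def pattern_block(index, block_size=BLOCK_SIZE):
    """Return one block filled with its own block number (8-byte big-endian words)."""
    return struct.pack(">Q", index) * (block_size // 8)


def write_pattern(dev, num_blocks, block_size=BLOCK_SIZE):
    """Fill the first num_blocks blocks of a seekable binary file object."""
    dev.seek(0)
    for i in range(num_blocks):
        dev.write(pattern_block(i, block_size))


def verify_pattern(dev, num_blocks, block_size=BLOCK_SIZE):
    """Return the byte offsets whose content differs from the expected pattern."""
    bad_offsets = []
    dev.seek(0)
    for i in range(num_blocks):
        expected = pattern_block(i, block_size)
        actual = dev.read(block_size)
        if actual == expected:
            continue
        for j in range(block_size):
            # A short read counts as corruption for the missing bytes too.
            if j >= len(actual) or actual[j] != expected[j]:
                bad_offsets.append(i * block_size + j)
    return bad_offsets


if __name__ == "__main__":
    # In-memory stand-in for a block device, so the sketch is safe to run.
    dev = io.BytesIO()
    write_pattern(dev, 8)
    dev.seek(1000)
    dev.write(b"\xff")  # simulate a single corrupted byte
    print(verify_pattern(dev, 8))
```

Because every block carries its own number, a chunk that ends up in the wrong place after the reshape shows up as a run of mismatching offsets. To use something like this against a real array, one would pass a file object opened on the md device instead of the BytesIO buffer; writing the pattern of course destroys whatever is on the device.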
The situation in which the bug first showed up was as follows:

1. A RAID5 reshape from 5 to 6 devices was started.
2. After about 4%, one disk failed; the machine appeared unresponsive and was rebooted.
3. A spare disk was added to the array.
4. The bad drive was re-added to the array in a different bay and the reshape resumed.
5. The drive failed again, but the reshape continued.
6. The reshape finished, and after that the resync.

The data after about 4% on the md device is broken as described above.
Kind regards,
Alex.