Re: RAID5 mdadm --grow wrote nothing (Reshape Status : 0% complete) and cannot assemble anymore

It seems that the problem no longer occurs after upgrading to Debian 10 (mdadm 4.1-1). This is not proof, only a hope that it is solved! Let's hope nobody ever has to complain about this problem from this version onward.

----------------------------------------------------------------------
RAID has been a fairly fixed standard for a long time. Yet after so many years of mdadm development, the subjects in the mailing list archive (unknown errors that nobody clearly understands, reverts, patches everywhere, deadlocks, null pointers, RFCs...) tend to indicate that mdadm isn't converging toward something really stable. Unless it is progressing the right way, just very slowly?

Only a precise bug tracking system could provide that information in a dependable way.

Also, since this is a critical feature meant to protect valuable data as much as possible (even if it is no substitute for a backup when something is overwritten or several disks are lost), changes that are not proven to be
 * mandatory,
 * a fix for a bug or problem that really exists,
 * free of any chance of adding a new bug or unnecessary complexity
should probably be forbidden in such critical functionality. On the other hand, I also understand that freedom is an essential part of this kind of project, but here this is quite critical!
----------------------------------------------------------------------

Anyway, that may be all for this case. Thanks again for work that I'm still happy to be able to use (though a little more cautiously than before!).

I'll probably propose some new documentation and help about recovering from serious failures for the wiki, using the knowledge and observations I gained during my searches and tests (after making sure that none of it could turn out to be wrong, or likely to cause wrong behavior or data loss, because this can be a really serious and distressing subject).

Best regards
Julien


On 5/1/19 6:13 PM, Julien ROBIN wrote:
Hi folks,

tl;dr: Some more information/confirmation, and an mdadm 4.1-1 test coming.

Even though I have some difficulty reproducing the issue on another computer (it happened only once in a large number of tests), on the real server it failed in exactly the same way a second time during the night, so it seems easier to reproduce there. This time the ext4 filesystem wasn't mounted.

So I'll upgrade it to Debian 10, which uses Linux 4.19 and mdadm 4.1-1, and run the test again to tell you whether the problem is still there.

By the way, while doing some tests on another computer, I found that running --create over an existing array after switching from 3.4-4 to 4.1-1 requires specifying the data offset, because the default changed. If it is not given, the array's filesystem isn't readable until you re-create the array with the right data-offset value.

That is, in case of the same failure after the upgrade (no actual data changed, but mdadm cannot assemble the array anymore), the exact command for recovering my server's RAID will be:

mdadm --create /dev/md0 --level=5 --chunk=512K --metadata=1.2 --layout left-symmetric --data-offset=262144s --raid-devices=3 /dev/sdd1 /dev/sde1 /dev/sdb1 --assume-clean

It also assumes that /dev/sdd1, /dev/sde1 and /dev/sdb1 haven't moved (I know the associated serial numbers, so it's easy to check). If the order is wrong, the array's filesystem isn't readable until you re-create it with the devices in the right positions.
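
For anyone in a similar situation, here is a rough sketch of the read-only checks I would do before running such a --create. It assumes the member superblocks are still readable and that the disks show up under /dev/disk/by-id with their serial numbers in the link names. The helper names (sectors_to_kib, show_member_info) are made up for this example, and the device names are the ones from my server, so adapt them. The "s" suffix in --data-offset=262144s means 512-byte sectors, which the first helper converts to KiB for comparison with other tools.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, device names are from this thread.

# Convert a count of 512-byte sectors (the unit of the "s" suffix) to KiB.
sectors_to_kib() {
    echo $(( $1 / 2 ))
}

# Print the data offset and role recorded in each member's superblock, then
# the by-id links so device names can be matched against serial numbers.
# Read-only, but needs root and an existing 1.x superblock on each member.
show_member_info() {
    for dev in "$@"; do
        echo "== $dev"
        mdadm --examine "$dev" | grep -E 'Data Offset|Device Role'
    done
    ls -l /dev/disk/by-id/
}

sectors_to_kib 262144   # prints 131072 (KiB), i.e. 128 MiB
```

On the real machine, as root: show_member_info /dev/sdd1 /dev/sde1 /dev/sdb1. If the offset or roles reported there differ from what you are about to pass to --create, stop and check again before writing anything.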

Doing some archaeology in the list archive on "grow" subjects, I found someone who suffered from what looks like the same problem, also on Debian 9 (the part about attaching the disk to a VM doesn't seem to be a real difference, and he found another way to get the array started again).
https://marc.info/?t=153183310600004&r=1&w=2
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=884719

I'll probably keep you informed tonight!


