On 01/05/19 02:49, Julien ROBIN wrote:
> Sorry for the delay
>
> On 5/1/19 1:39 AM, Wols Lists wrote:
>> On 30/04/19 09:25, Julien ROBIN wrote:
>>> I'm about to play the following command :
>>>
>>> mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdd1 /dev/sde1
>>> /dev/sdb1 --assume-clean
>>>
>>> Is it fine ?
>>
>> You clearly haven't read the raid wiki
>> https://raid.wiki.kernel.org/index.php/Linux_Raid
>>
>> It states NEVER EVER use mdadm --create on an existing array unless you
>> (or the person helping you) really knows what they are doing.
>
> That's not exact, when I finally decided to pressed "enter" I was way
> more sure than the wiki about what I was doing ;)
>
> mdadm failed despite the fact I entirely read (and respected) the wiki
> several times, longs times before, and last night too. I finally got the
> information about :
> * how to determine the correct parameters when you use --create
> * what are the exact conditions that should be met to use --create to
> reconfigure an access to an existing array (when mdadm or something else
> blocked/misconfigured the array but data are still available in a
> predictable way). Some others case aren't recoverable with --create.
>
> I found most of those informations elsewhere on the Internet and by
> doing my own tests - there is a lot of things that can be understood and
> explained which would be useful into the wiki. Most interesting parts
> are the cases in which --create can badly rewrite some data on a disk
> (game over), and on which disk (so that in some of those cases, others
> untouched disks may even be used to reconstruct the destroyed one using
> --create with correct parameters - so that the game wasn't really over).
>
> But unfortunately, all of those things aren't into the wiki.
> Well, just to say that I knew what I was doing, as the wiki asked.
>

Understandably, I'm a little wary of including stuff on the wiki I don't
understand myself.
If a case study happens on the list I try to write it up, but so far
I've never had to recover a raid myself, and with an unrelated full-time
job and family, spare time to play is hard to come by :-)

But really, --create should never be necessary. Like in this case, a
"revert reshape" should have worked.
>
>> Another thing it says is always update to the latest mdadm. I don't
>> remember you telling us what version you're using, and the problem you
>> describe sounds very much like something I suspect has been fixed in the
>> latest versions.
>
> Yes sorry for the delay, I forgot to say it into my previous posts - It
> is running Debian 9 (last apt update/upgrade : mdadm 3.4-4+b1 and linux
> 4.9.0-9-amd64)
>
> I guess that if it was a known bug, corrected some time ago, Debian
> would have included the patch into Debian 9? If I'm true, it means that
> the problem may still exist upstream. If I'm wrong, and Debian "mdadm"
> version is unstable, I won't feel really comfortable using "master / sid
> / experimental" branches for "safety and stability" (that would be
> really uncommon).
>

Unfortunately, like so many things, I think it's more like "this problem
has gone away" rather than "this problem has been found and fixed".
Quite a few deadlocks, incorrect states, and similar have been fixed
recently, and this problem is almost certainly one of those. So it might
have been fixed as a side effect of something else.

So there's quite a good chance Debian's mdadm has not had the fix
back-ported. I understand why they don't want to upgrade the version,
but really for a program like this they should. It's simple,
Linux-specific, and backwards-compatible, so it shouldn't cause any
problems.
>
> By the way, many thanks for your answers. Would be glad if my case
> helped you to find something to improve into mdadm - sorry if not!
>

Thanks. Hopefully, we'll improve things so we don't get any cases like
this :-) When pigs fly :-)

Cheers,
Wol