On Mon, Jul 13, 2015 at 8:54 AM, Phil Turmel <philip@xxxxxxxxxx> wrote:

> Hi Eddie,
>
> On older kernels without support for --replace, the correct
> operation is --add spare then --fail, --remove.

Makes sense.  That was my original plan, since I didn't know about the
replace option.  Doing otherwise was a bad decision on my part.

To make sure I understand this:

1) If you start out with a healthy 4-drive raid5 array and do
add / fail / remove, the "fail" step immediately removes that drive
from being an active participant in the array and causes the new drive
to be populated with data recalculated from parity, right?

2) The new drive will sit in the array as a "spare" until it is needed,
which doesn't happen until the "fail" step?

And, 3) The "replace" option, instead, does the logical equivalent of
moving all the data off one drive onto a spare, but doesn't involve the
other drives in a parity recalculation?

>> it shouldn't risk the actual data stored on the RAID, should it?
>
> In theory, no.  But the --grow operation has to move virtually every
> data block to a new location, and in your case, then back to its
> original location.  Lots of unnecessary data movement that has a
> low but non-zero error rate.
>
> Also, the complex operations in --grow have produced somewhat
> more than their fair share of mdadm bugs.  Stuck reshapes are usually
> recoverable, but typically only with assistance from this list.  Drive
> failures during reshapes can be particularly sticky, especially when
> the failure is of the device holding a critical section backup.

That all makes perfect sense, thanks.

> I don't use systemd so can't advise on this.  Without systemd, mdadm
> just runs mdmon in the background and it all just works.

I can't exactly say I use it by choice.  I'd change distros, but that
would only delay the inevitable.

> Growing and shrinking didn't do anything to replace your suspect drive.
> It just moved the data blocks around on your other drives, all while
> not redundant.

I'm confused here.  I started the grow 4->5 with a healthy raid5 on
4 drives.  One of the four drives was "suspect" in that I expected it
to fail at some point in the near future -- but it hadn't failed yet.
I thought this grow would give me a raid with four data drives + one
parity drive, all working.  (And it seemed to.)  And then I could fail
the suspect drive and go back down to three data drives + one parity.

The final output of the shrink certainly agrees with what you say, but
I clearly don't understand it.  I don't understand how going from 4
healthy drives to 5 healthy drives, then failing and removing one of
them and shrinking back down to 4 drives, ended up with 3 good drives
and one spare.  But that is what happened.

> It seems there is a corner case where, at completion of a shrink in
> which one device becomes a spare, the new spare doesn't trigger the
> recovery code to pull it into service.
>
> Probably never noticed because reshaping a degraded array is *uncommon*.
> :-)

It would be nice if my error in judgement helps save someone else in
the future!  If there is any data I can gather from my server that will
help, I can get it.  Although I won't be reproducing this experiment
any time soon on a server that has any data I care about.

But note that I didn't reshape a degraded array.  I reshaped a healthy
array and ended up with a degraded one.
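
For anyone who finds this thread later, here is roughly how I now
understand the two approaches on the command line.  The array and
device names below are just placeholders, not my actual setup, so
adjust to taste:

    # Older approach: add a spare, then fail/remove the suspect drive.
    # The rebuild onto the spare only starts at the --fail step, and the
    # array runs degraded until that rebuild finishes.
    mdadm /dev/md0 --add /dev/sde1
    mdadm /dev/md0 --fail /dev/sdb1
    mdadm /dev/md0 --remove /dev/sdb1

    # Newer approach on kernels/mdadm that support it: copy the suspect
    # drive's data straight onto the spare, keeping full redundancy the
    # whole time, with no parity-based reconstruction of the other drives.
    mdadm /dev/md0 --add /dev/sde1
    mdadm /dev/md0 --replace /dev/sdb1

That, at least, is my reading of what Phil described; corrections
welcome if I've got the mechanics wrong.
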
Eddie