Re: raid 5 crashed

On 02/06/16 22:01, Wols Lists wrote:
> On 02/06/16 00:15, Brad Campbell wrote:
>> People keep saying that. I've never encountered it. I suspect it's just
>> not the problem that the hysterical ranting makes it out to be (either
>> that or the pile of cheap and nasty drives I have here are model citizens).
>> I've *never* seen a read error unless the drive was in trouble, and that
>> includes running dd reads in a loop over multiple days continuously.
>> If it were that bad I'd see drives failing SMART long tests routinely
>> also, and that does not happen either.
>
> Note I didn't say you *will* see an error. BUT. If I recall correctly,
> the specs say that one read error per 10TB read is acceptable for a
> desktop drive that is designated healthy. In other words, if a 4TB drive
> throws an error every third pass, then according to the spec it's a
> perfectly healthy drive.
>
> Yes. We know that most drives are far better than spec, and if it
> degrades to spec then it's probably heading for failure, but the fact
> remains. If you have 3 x 4TB desktop drives in an array, then the spec
> says you should expect, and be able to deal with, an error EVERY time
> you scan the array.

No, it really doesn't. Those URE figures say "< 1 in 10^14", not "= 1 in 10^14". So that's a statistical worst case rather than "this is what you should expect". In addition, it's not a linear extrapolation; it's a probability.

By that logic I should "expect" to roll a 6 at least once every 6 dice rolls.

You can't extrapolate statistical figures like that, just as you can't calculate drive failures from MTBF figures.
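
To put a rough number on it, here's a back-of-the-envelope sketch. Purely for illustration it treats the spec figure as if it were an exact, independent per-bit error rate of 1 in 10^14, which it isn't (it's an upper bound):

import math

# Worst-case chance of at least one URE during a full pass over a
# 3 x 4TB array, *if* the "< 1 per 1e14 bits" spec were an exact,
# independent per-bit rate. It isn't; it's an upper bound.
p_per_bit = 1e-14
bits_read = 3 * 4e12 * 8                  # three 4TB drives, in bits
# 1 - (1 - p)^n, computed stably
p_any = -math.expm1(bits_read * math.log1p(-p_per_bit))
print(f"P(>= 1 URE per array pass) <= {p_any:.0%}")   # ~62%

So even taking the worst case as the actual rate, a full pass over three 4TB desktop drives hits a URE a bit over 60% of the time, not every time, and in practice drives run far better than spec.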

Just perform regular read tests on all drives and periodic array scrubs and you'll be much better off.

I've never had a reported URE on any of my arrays with SAS drives, though most have reallocated sectors. They perform background reads periodically and auto-reallocate anything that is looking dodgy.

SATA drives don't do that, but we can manage that externally with long SMART tests and array scrubs to force rewrite/reallocation.
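
For what it's worth, a minimal sketch of that external management, with made-up device names (sda/sdb/sdc as members of md0); in practice you'd stagger these from cron or a systemd timer rather than firing them all off at once:

import subprocess

# Hypothetical members and array name; adjust to suit.
DRIVES = ["/dev/sda", "/dev/sdb", "/dev/sdc"]
ARRAY = "md0"

# Kick off each drive's long (full surface) SMART self-test.
for dev in DRIVES:
    subprocess.run(["smartctl", "-t", "long", dev], check=True)

# Ask md to read-check the whole array. Unreadable blocks found during
# the check get rewritten from redundancy, which forces the drive to fix
# or reallocate them. Equivalent to:
#   echo check > /sys/block/md0/md/sync_action
with open(f"/sys/block/{ARRAY}/md/sync_action", "w") as f:
    f.write("check\n")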

Just don't go trying to extrapolate from manufacturers' probability data. There are plenty of garbage web pages littered around the net where "experts" do exactly that, leading to 'hysterical ranting' about how the world is ending and RAID5 is the devil. Sure, RAID5 can be a problem when a catastrophic drive failure forces a rebuild, particularly if you don't look after your drives (I use and prefer RAID6 to mitigate exactly that), but it's not the end of the world.


Now, on an interesting and somewhat related note. To get back to the idea of cloning with dd or dd_rescue: I had a thought last night that I've never seen mentioned anywhere.

When you clone a dud drive using dd_rescue, it creates a bad block log.

The reason we don't like doing this is that when you put the cloned replacement drive back into the array, md does not see the errors and will happily return zeroed data whenever it reads a sector that was bad on the old drive.

hdparm has a neat feature called --make-bad-sector. It uses a feature of the ATA protocol to write a sector that contains an invalid CRC, so the drive returns an error when you try to read it. The sector is restored by a normal re-write, so no reallocation or permanent damage takes place.
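
If you want to see the effect on a scratch sector first, something like this minimal sketch will do it (the device name and LBA are placeholders, and obviously don't aim it at anything you care about):

import subprocess

DEV, LBA = "/dev/sdX", 123456   # placeholders: a scratch drive and sector

# Mark the sector unreadable. The extra flag is the confirmation hdparm
# wants for its destructive commands (harmless if this build doesn't insist on it).
subprocess.run(["hdparm", "--yes-i-know-what-i-am-doing",
                "--make-bad-sector", str(LBA), DEV], check=True)

# Reading it back should now fail with an I/O error.
subprocess.run(["hdparm", "--read-sector", str(LBA), DEV])

# A normal rewrite (here: hdparm writing zeroes to the sector) clears it again.
subprocess.run(["hdparm", "--yes-i-know-what-i-am-doing",
                "--write-sector", str(LBA), DEV], check=True)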

If we took the bad block list from dd_rescue and fed it to hdparm to create bad sectors at all those locations on the cloned disk, md would get a read error there and attempt a recovery rather than returning zeroes. This would, in theory, cause a re-write of good data back to that disk and minimise the chance of data loss.
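
Something like this minimal sketch is what I have in mind. It assumes a GNU ddrescue-style mapfile where failed areas are marked '-' with hex byte offsets and sizes; the device name, sector size and mapfile path are placeholders, so check everything twice before pointing it at a real disk:

import subprocess

MAPFILE = "rescue.map"   # mapfile / bad block log from the cloning run
CLONE = "/dev/sdX"       # the *clone*, not the dying original
SECTOR = 512             # logical sector size of the clone

# Collect the LBAs covered by blocks the clone never read successfully.
bad_lbas = []
with open(MAPFILE) as f:
    for line in f:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # data lines look like: <pos> <size> <status>, hex, '-' = read failed
        if len(fields) == 3 and fields[2] == "-":
            pos, size = int(fields[0], 0), int(fields[1], 0)
            first = pos // SECTOR
            last = (pos + size - 1) // SECTOR
            bad_lbas.extend(range(first, last + 1))

# Punch matching unreadable sectors into the clone so md sees a read
# error instead of silently getting zeroes. Fine for a handful of bad
# sectors; for large failed regions you'd want something smarter.
for lba in bad_lbas:
    subprocess.run(["hdparm", "--yes-i-know-what-i-am-doing",
                    "--make-bad-sector", str(lba), CLONE], check=True)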

This might be a useful "last ditch" recovery method to bring up an array with a cloned disk while minimising data loss. Or say you are using it to bring up a RAID 5 with two failed disks: one completely dead and one that you managed to mostly clone. When you extract the data from the running, degraded array, md will pass the read error up the stack when it hits the bad sectors, so your copy or rsync session can log which files are affected as you back up the remaining contents, rather than silently handing you corrupted files.




