Re: SSD based sw RAID: is ERC/TLER really important?

> the recovery time in case of media errors could exceed kernel
> timeouts and possibly kick off the entire drive from the RAID
> set and, in turn, lead to a fault of a RAID5 system upon a
> subsequent error in a second drive.

My understanding seems different:

* The purpose of a short device error retry period is the
  opposite: to fail a drive as fast as possible in workloads
  where latency matters (or where there is a risk of bus/link
  resets hitting multiple drives). In those cases error retry
  periods of 1-2 seconds (at most) are common, rather than the
  mid-way "7 seconds" copied and pasted from web pages.

* The purpose of a long device error retry period is instead to
  minimize the chances of declaring a drive failed, hoping that
  one of many retries succeeds (but note the difference between
  reads and writes).

* It is possible to set the kernel timeouts higher than the
  device retry periods, if one does not care about latency, to
  minimize the chances of declaring a drive failed (note the
  difference between Linux command timeouts and retry timeouts;
  the latter can also be long).
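
As a rough illustration of the last point, here is a minimal
sketch of checking that a drive would give up before the kernel
does. It assumes the Linux command timeout is read from
/sys/block/<dev>/device/timeout (in seconds) and that the device
retry limit is the SCT ERC value as reported by smartctl (in
units of 100 ms, with 0 meaning disabled). The function names
and the 5 second safety margin are made up for the example.

```python
from pathlib import Path


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the Linux command timeout for `dev`, in seconds."""
    return int(Path(sysfs_root, dev, "device", "timeout").read_text().strip())


def timeout_margin(erc_deciseconds: int, kernel_timeout_s: int) -> float:
    """Seconds left between the drive giving up and the kernel timing out."""
    return kernel_timeout_s - erc_deciseconds / 10.0


def drive_gives_up_first(erc_deciseconds: int, kernel_timeout_s: int,
                         margin_s: float = 5.0) -> bool:
    """True if the drive reports the error well before the kernel timeout.

    An ERC value of 0 is treated as "disabled": the retry period is then
    unbounded as far as we know, so no kernel timeout can be called safe.
    The 5 s default margin is an arbitrary assumption, not a recommendation.
    """
    if erc_deciseconds == 0:
        return False
    return timeout_margin(erc_deciseconds, kernel_timeout_s) >= margin_s
```

For example, with ERC at 70 (7.0 seconds) and a command timeout
of 30 seconds the margin is 23 seconds, so the drive reports the
error first; with ERC disabled the check fails whatever the
kernel timeout is.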

> But in the case of SSD drives (where, possibly, the error
> recovery activities performed by the drive firmware are very
> fast) [...]

I guess that depends on the firmware. On one hand MLC cells can
become quite unreliable, especially at higher temperatures,
requiring many retries and lots of ECC; on the other hand, on
"write" allocating a new erase-block is easy because, unlike on
most HDDs, with an FTL the logical and physical sector locations
on an SSD are independent. Unfortunately most flash SSD makers
don't supply technical information on details like error
recovery strategies.