Re: Question about raid robustness when disk fails

Goswin von Brederlow wrote:
>> Is it possible to cancel a SATA/SCSI command that is being executed by
>> the drive?
>> (it's probably feasible only with NCQ disabled anyway, but it's easy
>> to disable NCQ)
>
> Do you want to do that? I would rather have the drive keep trying and
> return an error if it can't read so the raid layer rewrites the blocks
> causing it to be remapped. I do not want to wait for that but I want it
> to happen.

So you want that to happen in the background?
I don't see much benefit in having that happen in the background, imho.
Why not just have an error returned after a timeout and let the normal MD
read-error recovery procedure kick in (recomputation from parity and
rewrite of the damaged block)?
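
To make the recovery step concrete, here is a rough sketch of the idea on a single-parity (RAID-5 style) stripe: a read that fails is rebuilt by XOR-ing the same block from the other members, and the result is written back so the drive gets a chance to remap the sector. This is only a toy model, not the md driver's code; `Disk`, `ReadError`, `xor_blocks` and `read_with_recovery` are made-up names, and the "write clears the bad sector" behaviour is an assumption about how the drive reacts.

```python
# Toy model of read-error recovery on a single-parity (RAID-5 style) stripe.
# All names here are hypothetical; this is not the md driver's code.
from functools import reduce


class ReadError(Exception):
    """Raised when a simulated disk cannot return a block."""


class Disk:
    def __init__(self, blocks):
        self.blocks = list(blocks)   # block contents, as bytes
        self.bad = set()             # block numbers that fail on read

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)       # stands in for an error after a timeout
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data
        self.bad.discard(n)          # assumption: the drive remaps on write


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_with_recovery(disks, idx, n):
    """Read block n from disks[idx]; on error, rebuild it from the others."""
    try:
        return disks[idx].read(n)
    except ReadError:
        # XOR of the surviving data and parity blocks yields the missing one.
        others = [d.read(n) for i, d in enumerate(disks) if i != idx]
        data = xor_blocks(others)
        disks[idx].write(n, data)    # rewrite so the sector can be remapped
        return data
```

With two data disks and one parity disk, marking block 0 of the first disk as bad and calling `read_with_recovery(disks, 0, 0)` returns the original contents and leaves the block readable again. The point is that none of this needs to run in the background: it only needs the failing read to return an error in reasonable time.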

>> It's a pity we have to rely on TLER; this narrows the choice of drives
>> a lot...
>
> I don't. I just acknowledge the limitation and accept the downtime

The time might be so long that MD or the controller can drop the entire drive.
It didn't happen to me, but I think I read something like this on this ML...

> to
> find and remove a broken but not properly failed disk. I use raid so I
> don't lose my data when a disk fails, not primarily for availability.
> So far I had one case in 10 years where a failing disk took down my
> system.
