Re: [PATCH] raid456: avoid second retry of read-error

Nigel Croxon <ncroxon@xxxxxxxxxx> · Tue, 5 Nov 2019 17:46:04 -0500

On 11/4/19 7:33 PM, Wols Lists wrote:
> On 04/11/19 20:01, Nigel Croxon wrote:
>> The MD driver for level-456 should prevent re-reading read errors.
>>
>> For redundant raid it makes no sense to retry the operation:
>> When one of the disks in the array hits a read error, that will
>> cause a stall for the reading process:
>> - either the read succeeds (e.g. after 4 seconds the HDD error
>>   strategy could read the sector)
>> - or it fails after the HDD-imposed timeout (w/TLER, e.g. after 7
>>   seconds; might be even longer)
> Okay, I'm being completely naive here, but what is going on? Are you
> saying that if we hit a read error, we just carry on, ignore it, and
> calculate the missing block from parity?
>
> If so, what happens if we hit two errors on a raid-5, or 3 on a raid-6,
> or whatever ... :-)
>
> Cheers,
> Wol

This allows the device (disk) to fail faster.  All logic is the same.
If there is a read error, it does not retry that read; it calculates
the data from the other disks.  This patch removes the retry.
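
As a rough illustration of "no retry, calculate from the other disks",
here is a toy Python model.  It is not the md/raid5 code: ToyDisk,
ReadError, xor_blocks and read_chunk are names made up for this sketch,
and it assumes a single-parity (raid-5 style) stripe where the parity
chunk is the XOR of the data chunks.

```python
from functools import reduce


class ReadError(Exception):
    """Raised by the toy disk when its chunk cannot be read."""


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyDisk:
    """Hypothetical stand-in for one array member holding one chunk."""

    def __init__(self, chunk, bad=False):
        self.chunk = chunk
        self.bad = bad
        self.reads = 0  # how many read attempts this disk has seen

    def read(self):
        self.reads += 1
        if self.bad:
            raise ReadError("unreadable sector")
        return self.chunk


def read_chunk(disks, index):
    """Read one chunk of a stripe.

    On a read error the failing disk is NOT read a second time; the
    chunk is rebuilt by XOR-ing the chunks of all other members.
    """
    try:
        return disks[index].read()
    except ReadError:
        peers = [d.read() for i, d in enumerate(disks) if i != index]
        return xor_blocks(peers)


data = [b"abcd", b"efgh", b"ijkl"]
stripe = [ToyDisk(data[0]), ToyDisk(data[1], bad=True),
          ToyDisk(data[2]), ToyDisk(xor_blocks(data))]

rebuilt = read_chunk(stripe, 1)   # disk 1 fails once, chunk is rebuilt
```

In this sketch the failing disk sees exactly one read attempt.  If a
second member of the same stripe is also unreadable, the ReadError from
that peer simply propagates, since single parity cannot cover two
missing chunks.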

-Nigel