Re: SMART detects pending sectors; take offline?

Phil Turmel <philip@xxxxxxxxxx> · Mon, 18 Dec 2017 11:09:52 -0500

Hi Alexander,

On 12/18/2017 10:51 AM, Alexander Shenkin wrote:
> Hi all,
> 
> I'm getting back to this now that I'll have time, apologies for the 
> delay.  So, is the following correct in the case of a read error?

Not quite.

> 1) System tries to read an unreadable sector

> 2) Drive timeout reports unreadable based on drive timeout setting.

> 2a) In this case, mdadm sees the sector is unreadable and rewrites it
> elsewhere on that drive.

No.  MD reconstructs the sector from redundancy (mirror, reverse parity
calculation, or reverse P+Q syndrome) and writes it back to the *same*
sector.  Since the drive firmware reported an error here, it knows to
verify the write as well.  If the verification fails, the drive firmware
will relocate the sector in the background, invisible to the upper
layers.  As far as MD is concerned, that sector address is fixed either
way.  Relocations are handled entirely within the drive.  MD does not
perform or track relocations.
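
To illustrate the "reverse parity calc" idea, here is a minimal Python
sketch of single-parity (RAID5-style) reconstruction.  It is not MD's
code; the function names and the tiny 4-byte "sectors" are made up for
the example, and the P+Q (RAID6) case is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_missing(surviving_blocks):
    """Recover the one missing block of a stripe.

    With single parity, the missing block equals the XOR of every
    surviving block in the stripe (remaining data blocks plus parity).
    """
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: three data blocks and their parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

# Pretend data[1] came back as a read error: rebuild it from the rest.
recovered = reconstruct_missing([data[0], data[2], parity])
assert recovered == data[1]
```

The recovered bytes are what gets written back to the same sector
address on the drive that reported the error.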

> 3) If linux hangcheck timer runs out before the drive timeout, then 
> linux aborts the read, logs an error, and mdadm isn't given a chance
> to rewrite elsewhere based on checksums.

No.  The hangcheck timer issue described in your forwarded email is
unrelated.  And MD doesn't use checksums.

Each drive has a device driver timeout, as you note below, found at
/sys/block/*/device/timeout, that Linux's ATA/SCSI stack uses to cut off
non-responsive controller cards and/or drives.  If that timer runs out
on a read before the drive reports the read error, the low-level
*driver* reports a read error to the MD layer.  MD treats it the same as
any other read error, locating or recomputing the sector from redundancy
as above.  The difference in this case is that the physical drive isn't
talking to the controller (link reset in progress, typically), so the
corrective rewrite of the sector (to fix or relocate within the drive)
is refused, and that write error causes MD to kick out the drive.  The
pending sector is also left unfixed.
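
If it helps to see what is currently set, here is a small Python sketch
that reads the driver timeout for each block device from the sysfs path
mentioned above.  The function names are my own, and
driver_times_out_first() is only a hypothetical helper restating the
race described here: how long a drive takes to report a read error is
an input you would have to supply, it is not read from sysfs.

```python
import glob
import os


def read_driver_timeouts(sys_block="/sys/block"):
    """Return {device_name: timeout_seconds} for devices that expose
    <sys_block>/<device>/device/timeout."""
    timeouts = {}
    pattern = os.path.join(sys_block, "*", "device", "timeout")
    for path in sorted(glob.glob(pattern)):
        # Path ends in <device>/device/timeout, so the name is third from last.
        name = path.split(os.sep)[-3]
        try:
            with open(path) as f:
                timeouts[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric entry: skip it.
            continue
    return timeouts


def driver_times_out_first(driver_timeout_s, drive_error_report_s):
    """Hypothetical helper: True when the driver timeout expires before
    the drive itself would report the read error, which is the case
    where the corrective rewrite is refused."""
    return driver_timeout_s < drive_error_report_s


if __name__ == "__main__":
    for name, seconds in read_driver_timeouts().items():
        print(f"{name}: {seconds}s")
```

Devices without a device/timeout entry are simply absent from the
result.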

> Given all this, it seems to me that I should now set the hangcheck
> timer to something greater than drive timeout (180 seconds).  Does
> that sound right?  Otherwise, linux will kill the rewrite again, no?

In and of itself, waiting on I/O is not a hang.  So the hangcheck timer
should not be applicable.

Phil