Re: software raid and ERC

NeilBrown <neilb@xxxxxxx> · Wed, 18 Apr 2012 13:52:10 +1000

On Wed, 18 Apr 2012 11:12:57 +0800 "." <desire@xxxxxxxxx> wrote:

> Apart from the behaviour of the SCSI layer, does the linux software
> raid layer have any concept of timeouts that would cause a drive to be
> kicked when performing a deep recovery cycle?  A storagereview forum
> thread [3] claims that the linux software raid layer does not have a
> concept of timeouts and does not care about ERC.  In a web article [4]
> the major NAS manufacturers that use software raid seem to agree with
> this stance.

Linux software RAID does not have a concept of timeouts.

> 
> On the other hand, how I interpret a previous post from Stefan [5] is
> that the linux raid layer does have its own timeout mechanism that
> will kick a non-responding drive.

That aspect of that post is inaccurate.

> 
> > Without ERC-timeout, the drive tries to correct the error on
> > its own (not reacting on any requests), mdraid assumes an error after a
> > while and tries to rewrite the "missing" sector (assembled from the
> > other disks).  But the drive will still not react to the write request
> > as it is still doing its internal recovery procedure.  Now mdraid
> > assumes the disk to be bad and kicks it.
> 
> Since I can't read code, I'm hoping that this list where software raid
> development takes place would be able to clear up whether
> 
> a.  Do delays caused by deep recovery cycles actually have any direct
> impact on the linux software raid layer, or does it simply issue a
> command to the underlying storage/scsi subsystem and block until there
> is a response?

md/raid in Linux simply issues a command and waits for it to complete, either
with success or failure.
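
Since md only waits, any timeout that applies comes from the layers below it.
As a rough illustration, and not something taken from the md code, the sketch
below reads the per-device command timeout that the SCSI layer exposes in
sysfs at /sys/block/<dev>/device/timeout (in seconds). The function name and
the sysfs_root parameter are hypothetical, added only so the lookup can be
pointed at a test directory.

```python
from pathlib import Path
from typing import Optional


def scsi_command_timeout(dev: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the SCSI-layer command timeout in seconds for a block device.

    `dev` is a whole-disk name such as "sda". Returns None if the device
    has no readable timeout attribute (e.g. it is not a SCSI/SATA disk).
    This is a hypothetical helper; md itself keeps no such timer.
    """
    path = Path(sysfs_root) / "block" / dev / "device" / "timeout"
    try:
        # The attribute holds a single integer followed by a newline.
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

Calling scsi_command_timeout("sda") on a typical system returns the value the
SCSI layer will wait before it starts error handling for a command; md sees
only the eventual success or failure of that command.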

NeilBrown
