Re: devices get kicked from RAID about once a month

Robin Hill <robin@xxxxxxxxxxxxxxx> · Fri, 4 Jun 2010 14:50:31 +0100



On Fri Jun 04, 2010 at 09:30:09AM -0400, Dan Christensen wrote:

> Neil Brown <neilb@xxxxxxx> writes:
> 
> > On Thu, 03 Jun 2010 12:47:39 -0400 Dan Christensen <jdc@xxxxxx> wrote:
> >
> >> That could be useful.  And, as Neil said, if the SATA driver could be
> >> told to use longer timeouts, that might help.  Neil, if you think that's
> >> a good idea, maybe you could put the request in with the SATA folks?
> >
> > It might be a good idea.
> 
> After thinking about it more, I'm not sure I fully understand the
> situation.  
> 
> If I was able to turn on something like TLER on the drives, so read
> errors failed more quickly, what would the raid layer do when it got
> a read error? 
> 
It reconstructs the data from the other devices and attempts to write it
back to the failing sector.  If that write fails, the drive is then
failed out of the array.

> If the raid layer handles this in a clever way (and I recall some
> discussions about this), e.g. by reconstructing the data and rewriting
> the sector allowing the drive to remap it, then what I don't fully
> understand is why it doesn't also do this when there is a timeout on a
> read.  Is it because timeouts can indicate more serious problems?  Even
> so, wouldn't it be reasonable for the raid layer to give the drive a
> second chance before assuming it has failed?
> 
It does exactly the same on a read timeout.  The problem is that when it
sends the write, the drive is still busy attempting the read, so it
ignores the write request until it's free.  The write then times out as
well, so the array assumes the drive has failed.
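
To make that sequence concrete, here is a toy timeline in Python.  It is
only a sketch of the reasoning above, not of the kernel code: the
function name, the single shared timeout for reads and writes, and the
fixed write service time are all assumptions made for illustration, and
real error handling (resets, retries) is ignored.

```python
def simulate(drive_busy_s, layer_timeout_s, write_service_s=0.1):
    """Toy model: the drive is stuck in internal read recovery for
    drive_busy_s seconds; the SCSI/ATA layer gives up on any command
    after layer_timeout_s seconds (hypothetical, simplified)."""
    if drive_busy_s <= layer_timeout_s:
        # The drive answers (data or a read error) before the timeout.
        return "read returned"

    # The read timed out, so the RAID layer reconstructs the data and
    # sends a write straight away, while the drive is still busy.
    write_issued_at = layer_timeout_s
    write_done_at = max(drive_busy_s, write_issued_at) + write_service_s

    if write_done_at - write_issued_at <= layer_timeout_s:
        # The drive became free soon enough to service the write.
        return "write succeeded"

    # The write timed out as well, so the drive is failed.
    return "drive kicked"


for busy in (7, 40, 120):
    print(busy, simulate(busy, layer_timeout_s=30))
```

With a 30 second timeout, a drive that is busy for 7 seconds returns in
time, one busy for 40 seconds loses the read but accepts the rewrite,
and one busy for 120 seconds gets kicked.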

> These questions are motivated from the following logic.  Since it is
> generally recognized that quicker read errors (e.g. TLER) are good
> for drives in a raid array, *increasing* the SATA timeouts seems like it
> is going in the wrong direction.  Wouldn't it be better to have short
> timeouts, but have the raid layer treat a timeout less seriously?
> 
As has been stated, the RAID layer doesn't have any timeouts.  It's the
SCSI/ATA layer which times out the read or write and reports a failure
to the RAID layer.  If the timeout at this level is increased
sufficiently, there are three possible outcomes: the read eventually
succeeds; the read still fails but the write then succeeds, as the drive
is no longer busy; or the write fails too, in which case the disk really
has failed.  The downside is that the increased delay could run into
timeouts at other levels, but the consequences of those are likely to be
less severe than drives being failed at random.
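
For what it's worth, on Linux the per-device command timeout for
SCSI/libata disks is exposed in sysfs as /sys/block/<dev>/device/timeout
(in seconds).  A minimal sketch of reading and raising it follows.  The
helper names, the device name "sdb" and the value 180 are assumptions
for illustration only, and the sysfs_root parameter exists just so the
functions can be tried against a scratch directory.

```python
from pathlib import Path


def timeout_path(dev, sysfs_root="/sys"):
    """Path of the command-timeout attribute for a block device."""
    return Path(sysfs_root) / "block" / dev / "device" / "timeout"


def get_timeout(dev, sysfs_root="/sys"):
    """Return the current command timeout in seconds."""
    return int(timeout_path(dev, sysfs_root).read_text().strip())


def set_timeout(dev, seconds, sysfs_root="/sys"):
    """Write a new timeout (needs root on a real system) and
    return the value read back afterwards."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(dev, sysfs_root).write_text(f"{seconds}\n")
    return get_timeout(dev, sysfs_root)


# Example (hypothetical device and value, run as root):
#   set_timeout("sdb", 180)
```

The setting does not survive a reboot, so it would have to be reapplied
at boot for every member of the array.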

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |